Abstract


The aim of this Lecture Series is to show how computer assisted translation (CAT) can be of benefit not only to information
managers but also to end-users. Existing systems will be described as well as the nature of the texts to be processed, the
technical and human problems related to the use of such systems and the needs of end-users (quality level of translations,
information acquisition in the mother tongue...). Examples of on-going applications and systems under development will also
be presented. These examples will highlight the benefits documentation centres will derive from CAT and suggest solutions of
interest to the end-user.


This Lecture Series, sponsored by the Technical Information Panel of AGARD, has been implemented by the Consultant and
Exchange Programme.

Abrégé


The aim of this Lecture Series is to show the value that Computer Assisted Translation (CAT; in French, Traduction Assistée
par Ordinateur, TAO) can offer not only to the manager of an information centre but also to the end-user. After defining the
existing systems, the nature of the texts to be processed, the technical and human problems related to the use of the systems
and the needs of end-users (quality of translations, access to information in the mother tongue...), examples of applications
in progress or under development will be presented. These various applications will bring out the benefits that documentation
centres can derive from them and suggest solutions of benefit to the end-user.

This Lecture Series is presented within the framework of the Consultant and Exchange Programme, under the sponsorship of the
Technical Information Panel of AGARD.

TYPOLOGY - EXISTING SYSTEMS


Ian M. Pigott
Head  of  Systran  Project 
Commission  of  the  European  Communities 
Jean  Monnet  Building  B4/2JA 
L-2920  Luxembourg 


Abstract 

Various attempts have been made at defining a typology of MT systems, some based on
generations of software and hardware developments, others on the nature of the
translation process (e.g. direct, transfer, modular). Today, however, a classification
based more generally on performance and user access would appear to be more appropriate.
The paper will thus distinguish between large software packages installed on mainframe
computers for access by telecommunications and smaller PC packages functioning on MS-DOS
equipment. Attention will also be given to systems capable of dealing with limited
vocabulary and syntax as well as to developments in Japan which are beginning to set new
trends in MT technology. Finally, information will be presented on how systems are now
being used in practice and how use is likely to evolve over the next decade.


Introduction 

The typology of machine translation systems has been discussed and rediscussed over the past ten to
fifteen years. Initially, suppliers and research centres tended to express the maturity of their
developments in terms of "software generations" in much the same way as computer suppliers.
Distinctions based on generations became less and less meaningful as time went by, particularly as
some approaches labelled second, third or even fourth generation proved less reliable in practice than
earlier developments which had continued to mature.

John Hutchins in "Machine Translation - past, present, future" bases his typology on the nature of the
translation process itself. He thus distinguishes between direct (bilingual), interlingual, transfer
and semantics-based systems. The problem here is that practically all major developments have tended
to progress along similar lines. Systems which originally took a direct or bilingual approach have
since evolved into interlingual or even transfer systems while "semantics-based" systems have begun to
give additional attention to many of the syntactic criteria adopted in earlier developments.

It is for the above reasons that in presenting my own ideas on MT typology, I shall give more emphasis
to performance, improvability and user-friendliness than to distinctions in the linguistic make-up of
systems. Users are after all more interested in how well a system can do the job than in how the job
is actually done.


Existing  systems 

Most of the systems in current use originated in the United States in the sixties and seventies. They
fall into two basic categories: the larger, more complex systems such as Logos, Spanam and Systran
which are normally installed on centralized mainframe computers and can be accessed by
telecommunications, and less sophisticated products such as Smart, Globalink, Linguistic Products and
Weidner which run on personal computers or workstations at the user site. This second category should
however not be underestimated since in the language software industry, as in other areas, there is a
general tendency for desktop applications to evolve rapidly on the basis of user requirements.

Recent newcomers to the user market include Metal and Tovna. Metal was originally developed by the
University of Texas at Austin and is currently supported by Siemens, Munich. The system now runs on
Unix and extensions from the original German-English are being made to cover Spanish, French and
Dutch. Tovna, another Unix-based system, is being developed in Jerusalem and has already been
installed at several user sites for English-French.

Finally, over the past couple of years a number of Japanese systems have reached the marketplace,
mainly for Japanese-English and/or English-Japanese. However, Fujitsu's Atlas system is already being
extended to European language combinations and other Japanese manufacturers are likely to follow this
trend. Given the enormous investments now being made by all the large Japanese companies in machine
translation and related technologies, products from Japan are likely to start penetrating the European
and US markets within the next couple of years.

Quality of output depends very much on the language pairs involved, the type of document and, of
course, the coverage of technical terminology. It often happens that a given product will provide a
reasonable level of quality for one language pair and far less satisfactory results for another. As a
general rule, developments involving the Latin languages (French, Spanish, Italian and Portuguese) and
English tend to produce rather better results, for a given amount of investment, than systems
involving Germanic or Asian languages.

But language coverage in MT systems is now generally very good. Most operational systems cover French
and English in both directions and most also have German and Spanish as either a source or target
language. English is undoubtedly the most highly developed source language with a wide range of
targets such as Spanish, Dutch, Portuguese, Danish, Swedish, Japanese and Arabic. Russian, an old
favourite in the 1960s, is regaining attention along with Chinese and Korean which have joined the
club more recently.

User  requirements 


User  requirements  fall  basically  into  two  categories:  information  assimilation  and  information 
dissemination,  although  of  course  there  are  grey  areas  between  these  two. 

- Information assimilation can be described as the gathering of information from internal or external
sources for general use by an individual in keeping abreast with evolving policies, markets or
technical advances;

- Information dissemination covers the whole process of communicating or publishing documents for
(often unidentified) third parties.

Examples of documents used or translated for purposes of information assimilation are press reports,
documentary data bases, technical reports from consultants or from industry in general. The reader's
main aim is to understand the message of the documents in question and he will thus usually accept
comparatively lower standards of translation. Very often, in this context, speed and low cost are of
primary importance.

In regard to assimilation, the United States Air Force have used Systran since 1970 to translate first
from Russian and later from French and German into English. The documents cover a wide range of
technical sectors and user satisfaction is said to be high. In Europe, the Nuclear Research Centre in
Karlsruhe, West Germany, has a similar application involving the translation of French-language
research papers into English. At the European Commission too, use of raw machine translation for
information purposes has been steadily increasing over the past couple of years, particularly in cases
where users are unable to obtain human translations within the time available.

As for information dissemination, documents currently being submitted to MT include not only
maintenance manuals and technical reports - which in many cases appear to be ideally suited to the
technology - but policy papers, administrative documents and even journal articles.

In most cases, translation quality for dissemination needs to be high and in some cases it needs to be
excellent. Here, machine translation can often be used as a basis for human editing up to the required
standard. Particularly when texts are repetitive and rich in technical terminology, machine
translation can be a useful aid in reaching top quality standards.

By far the most common and successful application of machine translation for dissemination or
publication purposes is indeed the translation of maintenance manuals. Most MT systems, both large and
small, are being used in this way. Large corporations such as Xerox, IBM and Siemens have already
achieved quite a record of success, while small hardware and software suppliers are now beginning to
report encouraging results with desktop MT software.

The best results here involve a combination of careful source document preparation, a dependable level
of technical terminology in the MT system, and human post-editing. The major advantages are not just
speed and cost but consistency of terminology which provides for more immediate intelligibility.

In the public sector too, institutions such as NATO, some of the UN agencies and, of course, the
European Commission itself are also making use of MT to translate technical reports, administrative
documents and minutes of meetings. Raw MT quality is sometimes adequate for user requirements and in
many cases rapid post-editing (at a rate of say four pages per hour) provides acceptable results.
Post-editing is normally carried out by translators but there is increasing evidence that engineers or
other subject-field experts can also produce good results.

Finally, use of machine translation via public networking facilities is beginning to have a
considerable impact. In France, it is already being used in significant volumes on the Minitel network
where Gachot S.A. provides a number of on-line services using the Systran system. In Canada, the Smart
system is being used by the Department of Employment to translate job descriptions between English and
French for coast-to-coast access. In Europe, experiments are already underway to combine multilingual
database interrogation packages with machine translation in order to provide the non-specialist with
rapid and reliable means of accessing foreign language databases.


What remains to be done?


Machine  translation  can  hardly  be  regarded  as  a  technology  in  its  own  right.  For  it  to  be  used 
successfully  by  the  non-expert,  much  remains  to  be  done  to  overcome  many  of  the  technical  problems 
which  often  outweigh  its  advantages. 

On the one hand, there is the problem of document preparation. The non-expert user sitting at his PC
or Minitel terminal knows nothing of the workings of the translation software. He is unaware of the
fact that a spelling error, missing punctuation or non-standard formatting will lead to translation
errors.

Here  progress  can  be  made  at  two  levels.  On  the  one  hand,  spelling  correction  technology  can  be 
integrated  in  the  automatic  interface  to  the  MT  system  while  on  the  other,  a  degree  of  online  screen 
editing  can  be  introduced  to  draw  the  user's  attention  to  syntactic  and  even  semantic  problems  in  his 
draft.  This  type  of  technology  is  developing  quickly  but  improvements  in  user-friendliness  are  called 
for. 
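
As a minimal sketch of the first kind of check, a pre-translation filter might flag unknown words and missing final punctuation before a text is submitted. The word list, the function name `precheck` and the warning labels are all assumptions made for illustration; none of them comes from any of the systems described here.

```python
import re

# Hypothetical word list; a real interface would consult the MT system's own dictionaries.
KNOWN_WORDS = {"the", "pump", "must", "be", "replaced", "every", "year"}

def precheck(text, known_words=KNOWN_WORDS):
    """Return (kind, detail) warnings for a source text before translation."""
    warnings = []
    # Flag every word that the (assumed) dictionary does not contain.
    for word in re.findall(r"[A-Za-z]+", text):
        if word.lower() not in known_words:
            warnings.append(("unknown word", word))
    # Flag a text that does not end with sentence-final punctuation.
    stripped = text.strip()
    if stripped and stripped[-1] not in ".?!":
        warnings.append(("missing final punctuation", "end of text"))
    return warnings
```

Calling `precheck` on a draft returns an empty list when nothing is flagged, so an interface could pass clean drafts straight through and show the warnings for the rest.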

In addition, as companies with large multinational requirements become more aware of the cost of
translation activities (which can extend to 10% of production costs), it is probable that they will
pay more attention than in the past to document drafting. The editing or critique software packages
now on the market are designed to discipline authors and their secretaries in the use of vocabulary
and syntax in order to reduce to a minimum the possible ambiguities in a source text. This approach
makes not only for better comprehension in the source language itself but for quicker and more
reliable translations. Above all, source texts drafted along these lines are far more suitable for
machine translation than undisciplined drafts.
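
The kind of drafting discipline described above can be illustrated with a small rule checker. This is only a sketch under stated assumptions: the sentence-length limit, the preferred-term table and the function name `critique` are hypothetical and do not reproduce any commercial critique package.

```python
import re

MAX_WORDS = 20                                        # assumed sentence-length limit
PREFERRED = {"commence": "start", "utilise": "use"}   # hypothetical house vocabulary rules

def critique(text):
    """Return (sentence_number, message) pairs for each rule violation."""
    findings = []
    # Split after sentence-final punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]
    for number, sentence in enumerate(sentences, start=1):
        words = re.findall(r"[A-Za-z']+", sentence)
        # Long sentences are more likely to be syntactically ambiguous.
        if len(words) > MAX_WORDS:
            findings.append((number, f"{len(words)} words; limit is {MAX_WORDS}"))
        # Suggest the approved term wherever a discouraged one appears.
        for word in words:
            replacement = PREFERRED.get(word.lower())
            if replacement:
                findings.append((number, f"use '{replacement}' instead of '{word}'"))
    return findings
```

Keeping the rules in plain tables would let a documentation department adjust the limit and the vocabulary without touching the checking logic.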

Several companies have already adopted this strategy, particularly in connection with maintenance
manuals. Extensions to other types of document, for example report writing, can be expected to follow
soon.


Current  trends 


Over the past year, we have seen a number of encouraging extensions to the machine translation market.
Logos, Metal, Systran and Tovna have all been successful in finding new customers while sales of
desktop packages such as those supplied by Linguistic Products also appear to be on the rise.
Extensions to new language pairs have kept pace with the applications side although now, as in the
past, there has been a tendency to oversell all new extensions and developments.

Some MT packages, though, have been the victims of restructuring or new company policy. Alps, who
still support their computer-assisted translation packages, have concentrated their efforts on
translation services in general, particularly through the acquisition and networking of a number of
large translation bureaux. Weidner, which had a number of MT packages for European language pairs on
PCs, appears to have discontinued reliable support after being taken over by the Japanese company
Bravice. Bravice itself, on the other hand, seems to be making considerable progress with
English-Japanese and Japanese-English versions of the software.

The Canadian MT market, in particular, appears to be expanding. Logos, Smart and Tovna all have
applications there for English-French, mainly in connection with translation projects supported by
government funding. However, the ambitious four-million-dollar Gigatext project supported by
Saskatchewan seems to have run into serious difficulties.

Systran has been used more extensively by NATO, Xerox, the US Air Force and on the Gachot Minitel
network. The European Commission has brought the system on line for internal users (25,000 pages
translated in 1989) and is embarking on major applications of the software for the translation of
patent literature in collaboration with the European Patent Office.

Last but not least, the Japanese giants, who nearly all have MT developments, have continued to make
progress on the applications side. Several systems are now operational for English-Japanese and
Japanese-English although hard statistics on actual users are difficult to obtain.


Progress  on  MT  research 

Over the past few years there has been a steady increase in the MT research sector. As we have already
seen, the most notable developments have been in Japan where all the large computer manufacturers are
developing systems for English-Japanese and Japanese-English and to some extent for other language
combinations. The most successful to date appears to be Fujitsu with its Atlas systems.

In Europe, the major research project continues to be Eurotra, cofinanced by the European Community and
its Member States. It was originally hoped that pilot systems for all the European languages would
become operational by the end of 1990 but this goal is proving more and more difficult to achieve.
Eurotra objectives for the future are likely to be based more on providing a range of
language-processing products for the various EC Member States than on MT alone.

Other projects in Europe include DLT (Distributed Language Translation) in the Netherlands, which is
based on the use of Esperanto as a pivot language, and Rosetta - supported by Philips in the
Netherlands - which is expected to produce the first operational results in 1990 in systems
translating between English, Dutch and Spanish.

In the United States, IBM has once again become involved in MT development, mainly for the translation
of its own technical documentation. A number of European universities and research centres are
involved in their LMT (Logic-programming-based Machine Translation) project with the development of
prototype versions covering English, Danish, French, German and Spanish.

One of the developments which could provide interesting results in the medium term is the
Carnegie-Mellon Knowledge-Based Machine Translation project. As its name implies, the project is aimed
at using artificial intelligence to resolve natural language ambiguities. As the cost of such
developments is very high, even for a narrow subject area, the project could well run into financial
difficulties. The approach itself is, however, quite an interesting one.

By and large, though, MT research results have been rather disappointing. Some large projects such as
Calliope in France have been terminated. The Japanese systems have proved more difficult to develop
than originally anticipated and Eurotra has suffered from difficulty in coordinating developments in
the various countries concerned.

With the possible exception of Tovna, the result has been that more traditional approaches to MT have
been generally more successful than innovative strategies.


Selection  of  a  system 


In my introduction I pointed out that the most important aspect of a typology of machine translation
was to assist the user. I have now given an overview of current developments and prospects for the
future but perhaps for many it is not a very good basis for choosing an MT system for practical
application.

One  of  the  key  questions  is,  of  course,  "Are  you  principally  concerned  with  publishing  information  or 
with  collecting  information?" 

If  you  need  to  publish  information,  you  are  probably  already  employing  translators  (either  in-house  or 
under  contract)  to  ensure  that  your  quality  requirements  are  met.  If  you  decide  to  turn  to  machine 
translation,  you  will  no  doubt  wish  to  maintain  similar  standards. 

The  criteria  you  should  look  at  most  closely  in  choosing  an  MT  system  can  be  summarized  as  follows: 

-  Has  the  system  already  been  developed  for  the  languages  and  subject  areas  which  are  of  interest  to 
you? 

- Can the supplier provide names and addresses of users who have sufficient experience of the system to
discuss its merits?

- What additional developments (if any) will be necessary to bring the system up to the quality you
require (at whose cost and over how many months)?

-  How  easy  will  it  be  to  integrate  the  system  into  your  own  existing  technical  infrastructure? 

-  Can  you  take  action  to  improve  the  quality  of  your  source-language  documents  (particularly  important 
if  more  than  one  target  language  is  required)? 

-  What  measures  can  you  take  to  ensure  that  post-editors  will  indeed  be  able  and  willing  to  make 
efficient  use  of  the  system? 

As  I  may  have  implied,  the  cost  of  a  system  (whether  under  a  purchasing  or  licensing  agreement)  may 
not  be  the  key  factor.  Most  users  have  found  that  integration  and  further  development  costs  - 
particularly  on  dictionaries  -  are  likely  to  cost  far  more  than  the  initial  installation.  In  addition, 
it  might  well  prove  difficult  to  convince  translators  that  they  really  have  something  to  gain  from  the 
use  of  an  MT  system;  they  might  well  be  opposed  to  changing  working  methods  or  becoming  a  "slave"  to 
the  machine.  User-friendliness,  particularly  as  far  as  post-editing  is  concerned,  is  thus  of  the 
utmost  importance. 

If  you  are  primarily  interested  in  collecting  or  scanning  foreign  language  information,  then  your 
priorities  are  likely  to  be  rather  different.  These  might  be: 

-  Can  the  system  deal  with  a  wide  range  of  text  types  and  subject  fields? 

- Is the quality of the output (for your language pairs) readily intelligible without human
intervention?

- Is the supplier likely to provide new improved versions which will increase the level of performance
you need? (As a user of raw output, you are less likely to be willing or able to participate in
system improvement than the "publishers".)

- How fast is the turnaround of the MT system?

- Does the amount of material you need to scan justify the investment?

Both groups of users would also be well advised to look into ways and means of installing suitable
peripheral equipment to be used in connection with machine translation. This might include:

- optical character reading for inputting hard copy;

- sophisticated word processing software for text preparation and any pre- or post-editing;

- grammar and style checkers;

-  suitable  telecommunications  facilities  (if  required). 


The  future 


Over the next ten years, machine translation is likely to be used more and more extensively,
particularly for many routine types of translation processing as well as for information assimilation
purposes. Technical documentation, which is already by far the largest source of translation, will
increasingly be submitted to MT processing as the drafting of source material improves.

We are unlikely to see any really revolutionary approaches to MT processing. Existing systems will
continue to improve with experience and new developments will tend to fall back on well-established
processes as the difficulties of programming new linguistic strategies are encountered in practice.

The main users will be multinational corporations and international organizations; database suppliers
and all those involved in the on-line information industry will also become dependent on machine
translation as the largely English-language information resources come into multilingual access and
use.

By the year 2000, Japan is likely to be the main supplier of MT systems and services. Europe will
continue to make use of its linguistic heritage in extending and improving projects originating in the
United States and Japan but it is questionable whether it will be successful in developing any major
systems of its own.

Systems will become more user-friendly as improved peripherals are introduced, whether on stand-alone
systems running on PCs or as a means of improving access to larger systems via telecommunications.
Whatever the approach, standardization of document architecture, telecommunications protocols and
natural language character sets can be expected to pave the way for increased integration between MT
systems and peripheral software in general.

Input technology will also have a major impact on MT use as optical character reading improves and
voice technology develops.

Finally, typology itself is likely to evolve once more as market forces compete on two basic fronts:
integrated desktop software on ever more powerful machines versus machine translation services
provided by telecommunications from remote, but ever more sophisticated hosts.

Whether  or  not  it  will  be  possible  to  carry  out  machine  interpretation  between  various  languages  as 
voice  analysis  techniques  are  developed  for  automatic  dictation  still  remains  a  largely  unanswerable 
question.  Expectations  are  high,  particularly  in  Japan,  but  developments  -  as  in  traditional  MT  -  are 
taking  longer  than  expected. 




The technical environment
of computer-assisted translation.

Summary

Paradoxically, at a time when the declared or potential needs for translation and interpretation are
enormous and increasingly pressing, and when spectacular progress is being made throughout the
information technology sector, CAT seems to be marking time and even appears to be in decline in
some countries.

The primary reason is that its technical environment is too often deplorable in every respect,
particularly as regards the man-machine interface and the ergonomics of the systems.

For the sake of clarity, the paper examines in turn:

- the way the problem presents itself today: a reminder of what is at stake and of the different
types of need and fields of application that may lead to different environments.

- the technical environment in the research and development phase, where a distinction is drawn
between computing and related aspects on the one hand and linguistics on the other; this stage leads
on to the industrialization of the product or of a version of the product.

- the operational environment, in which the system, still quite fragile and open to criticism yet
regarded as 'indefinitely perfectible', needs to be integrated into an application or at a customer's
site. This integration will be possible if a number of conditions are met, in particular if the
translator is genuinely helped in practice. The translator can contribute a great deal to the
operational life of the system if the ergonomics, as far as he is concerned, are appropriate, for
example through a judicious division of tasks, so that he keeps those which normally fall to him and
retains responsibility for the 'secondary work' that the final translation constitutes.

The conclusion takes the form of a series of recommendations summarizing the points that need
attention if current systems are to be improved and if future systems are to gain wider acceptance
and deliver better performance.

ooOoo 


The lessons of the history of CAT?
A sea serpent?

It is important to bear in mind that the already long and eventful history of machine translation
consists of a series of excessively optimistic or euphoric proclamations alternating with periods of
silence and neglect, rather as happens with UFOs (unidentified flying objects). This history has
damaged the field's credibility with potential users as well as with the organizations that funded
the research. Figure 1 presents the key points of this history. It is interesting to note that in
Munich (Summit II, August 1989) there was a scramble for copies of the report by JEIDA (Japan
Electronics Industry Development Association) entitled 'A Japanese view of CAT in the light of the
considerations and recommendations of the ALPAC report'. The Japanese, drawing the lessons of history
in their own way, are today taking the opposite view to that of the ALPAC (Automatic Language
Processing Advisory Committee) report. The JEIDA report points to the enormous translation market in
Japan as grounds for recommending massive investment in this sector.

It is also interesting to note, in this historical review, that the Prolog language, which was
invented in France to meet the needs of machine translation, was eventually adopted by Japan in
projects related to artificial intelligence and has come into general use in a variety of
applications.

Elimination of translators?

The concept of automatic translation, which implied that the machine would provide the solution, has
gradually given way to the much more realistic and more modest notion of computer-assisted
translation. Here it is recognized, with rather more humility, that the objective will be less
ambitious and that the solution can only come from the combined efforts of linguists and computer
scientists, researchers and system promoters on the one hand, and of users, in particular company
managers and translators, on the other. The latter are indispensable in the development phase of a
product which will always remain open to improvement and therefore dependent on translators.

Underestimated problems.

Translators, for their part, had remained very sceptical about the
results to be expected from an automatic translation that would use
binary logic to solve the finely nuanced problems they face. These
problems stem, for example, from figures of speech (see below), which
are far more frequent than one might think, even in technical language,
and from the clumsiness of authors, whom translators often have to help
out in awkward arguments expressed in obscure or ambiguous jargon ('the
fish found dead in the river will be replaced by farmers'). In one of
his essays, Eugene Garfield even regarded writing as a function best
left to specialists ('a job for professionals').

FIGURES OF SPEECH.

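synecdoche: the part for the whole and vice versa ('faire de la voile',
to go sailing, literally 'to do sail').

anacoluthon: an abrupt but permissible change of grammatical
construction.

antonomasia: use of a proper noun in place of a common noun ('Wall
Street n'a pas réagi', Wall Street did not react).

imagery: 'la charrue avant les boeufs' (the cart before the horse,
literally 'the plough before the oxen'); 'Etoile rouge sur la grande
bleue' (the red star, i.e. the Soviets, over the 'big blue', i.e. the
Mediterranean).

analogy: 'cette affaire est un serpent de mer' (this business is a sea
serpent); 'on nous mène en bateau' (we are being taken for a ride,
literally 'led by boat').

ellipsis: omission of words that are not essential: 'train rentré'
(gear up, with 'train' standing for 'train d'atterrissage', landing
gear).

metaphor: transfer of meaning ('brûler de désir', to burn with desire).

litotes (understatement): 'il s'est éteint' (literally 'he went out')
for 'il est mort' (he is dead).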

archaisms, neologisms not yet officially accepted, puns, proverbs and
all the rest: all the expressions and idioms that are not necessarily
easy to detect ('trop c'est trop' for 'enough is enough'), and then
false friends ('I recognize you'), homographs, polysemy, and the
abbreviations, symbols, codes, acronyms and formulae that are
increasingly frequent, particularly in technical language. One can see
how strewn with formidable pitfalls the path is. It follows that, far
from ignoring translators, we shall need them, because they know well
the traps to be avoided between source language and target language.
Let us not forget, for example, that Peter Toma, the father of Systran,
was first and foremost a polyglot.


Telescoping of the development phase and industrialisation.

The eagerness of researchers to announce results and successes long
masked the fact that research could not be followed directly by
application. A long and arduous stage of development and learning was
unavoidable, with, as a last resort, the help of translators thoroughly
familiar with the source and target languages. This had to be followed
by an industrialisation phase leading, for example, to a portable
product, compatible with the most commonly used computers, allowing
interactive use with good ergonomics, acceptable processing times and
immediate account taken of users' observations. On all these points
progress has been, and still is, very slow and uncertain, yet real
success depends to a large extent on these all too often neglected
aspects. The user expected a turnkey product and was surprised to find
that it fell to him to fill the empty shell he had been handed.


The history of CAT has thus brought out a number of points that must
henceforth be kept carefully in mind: for example, that objectives
should not be set too high, that well-delimited and circumscribed
application domains should be chosen wherever possible, and that a
system should not be asked to translate just any document.

The success of TAUM METEO, fully operational and cost-effective, bears
this out (cost: 0.03 Canadian dollars per word, for a throughput of 3.5
million words per year). Figure 2 shows that the problem here is a
relatively simple one, so simple that in this case one can indeed speak
of automatic translation, since no revision is needed.
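
As a rough check on the order of magnitude, the two figures quoted for
TAUM METEO can be multiplied to give an implied annual cost. This is
simple arithmetic on the quoted values (reading the throughput as 3.5
million words per year), not a figure given in the source; the variable
names are illustrative only.

```python
# Annual translation cost implied by the two quoted TAUM METEO figures.
# Amounts are kept in integer cents to avoid floating-point rounding.
COST_PER_WORD_CENTS = 3        # 0.03 Canadian dollars per word
WORDS_PER_YEAR = 3_500_000     # 3.5 million words per year

annual_cost_cents = COST_PER_WORD_CENTS * WORDS_PER_YEAR
annual_cost_dollars = annual_cost_cents // 100

print(f"Implied annual cost: {annual_cost_dollars:,} Canadian dollars")
```

On these assumptions the implied cost is 105,000 Canadian dollars per
year.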

The notion of distinct stages

History also teaches us that grammars and analysis mechanisms or
algorithms on the one hand must be kept separate from dictionaries and
terminology tools on the other, so that the system can evolve by more
easily incorporating the progress made in each of these areas.

It must also be borne in mind that quality depends on developments at
several levels:

1. Transliteration, verification and preparation of the text
(everything that can come under the so-called 'pre-editing' phase, i.e.
all the tasks that enable the machine to recognise as well as possible
what it is given).

2. Word-for-word translation (based on more or less sophisticated
dictionaries, possibly extending to 'navigation' in an organised
terminological knowledge base, such as a thesaurus of descriptors; this
base may be organised either a priori or from the corpus entered).

3. Syntactic analysis (syntax trees making it possible to go beyond
simple word-for-word translation by identifying subject, verb,
complements and so on). This is the level at which the construction of
the sentence can be recognised, independently of the recognition of its
information content.

4. The fourth level, which comes on top of syntactic analysis, is that
of semantic analysis. 'He is a gas' becomes comprehensible only if He is
linked to helium. In Japanese in particular, where word order is not
rigid as it is in English, a purely grammatical analysis leaves a good
many ambiguities unresolved (1).


5. Last and most important, quality stands a chance of being achieved
only if context indicators are available, which presupposes that the
machine has some knowledge of the outside world and a certain ability to
reason from the facts or data supplied to it. A good part of the
vicissitudes of CAT stems from the fact that until now these context
indicators were practically non-existent; the great advance will come
from the recourse, now possible, to artificial intelligence.
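
The interplay between dictionary lookup (level 2) and context
indicators (level 5) can be illustrated with a deliberately tiny sketch.
It is not taken from any system described here: the English-French
lexicon, the domain labels and the function name are all invented for
the illustration, and a real system would also need the syntactic and
semantic levels. The dictionary is held as data, separate from the
lookup mechanism, in line with the separation recommended above.

```python
# Toy illustration: word-for-word lookup with an optional context indicator.
# The lexicon and the "chemistry" domain label are hypothetical.
LEXICON = {
    "he": {"general": "il", "chemistry": "l'hélium"},
    "is": {"general": "est"},
    "a": {"general": "un"},
    "gas": {"general": "gaz"},
}

def translate_word_for_word(sentence: str, domain: str = "general") -> str:
    """Translate each word independently, preferring the sense tagged with `domain`."""
    output = []
    for word in sentence.lower().split():
        senses = LEXICON.get(word)
        if senses is None:
            # Unknown word: flag it for the human reviser instead of guessing.
            output.append(f"[{word}]")
        else:
            # Fall back to the general sense when no domain-specific one exists.
            output.append(senses.get(domain, senses["general"]))
    return " ".join(output)

print(translate_word_for_word("He is a gas"))
print(translate_word_for_word("He is a gas", domain="chemistry"))
```

Without a context indicator the sketch renders 'He' as the pronoun
('il est un gaz'); with the hypothetical 'chemistry' indicator it picks
helium ('l'hélium est un gaz').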

Convergence with expert systems.

The trend appears to be towards organising communication between the
translation mechanisms and the linguistic knowledge base. This
communication is managed by the KBMS (knowledge base management
system), and that management involves feedback used to expand and
improve the knowledge over the life of the system (2). In other words,
we are moving closer to the philosophy of expert systems: rules used by
an inference engine, and recourse to a knowledge base whose content is
managed by knowledge engineers making the best possible use of the
know-how of experts. Here those experts could well be translators and
conference interpreters, who are the real experts and whose help, let us
repeat, will make it possible to avoid the blunders still too often
found in the output after years of research and development!

Impact of the market on the environment
Perception of what is at stake.

The stakes have become much higher. Above all, they are better
perceived and taken into account at the political level.

Investment in CAT is recognised as a necessity because it is now
generally agreed that intangible investment matters alongside tangible
investment, not only in the scientific and technical sector but also in
business, insurance, banking, tourism, law and religion, in short in
everything that bears on culture and on communication between peoples.
It is becoming apparent that lowering the language barrier is the
greatest challenge of this end of century.

Europe, very wisely, is doing everything possible to preserve each
language. 'When a language dies, with its colours and its nuances, the
people dies too' (Maila Talvio, Finland, Pensées Eternelles).

Babelism in the world.

The reality, however, is that there are some 3000 living languages in
the world, among which choices have to be made on the basis of cultural
stakes and, more prosaically, of the markets that can be expected.

Their relative importance can be measured (3):


- by ethnic group: Chinese first, then English (8.6%), then Hindi,
Spanish and Russian, French being only 12th with 117 million people.

- by number of speakers: English (30%), far ahead of Portuguese (7%)
and Russian (6%).

- by volume of scientific and technical publications: English (over
50%), with Russian, German, French and Japanese together accounting for
a further 40%.

- by literary output.

- by the standing of authors: 21 Nobel prizes awarded for works in
English, 12 for French, 9 for German...


European policy: Systran, then Eurotra.

Europe has given up the idea of adopting a single language (English,
French or Esperanto).

In 1975 the EEC acquired Systran as a first step towards a solution
but, becoming aware of its limitations and shortcomings, it launched the
Eurotra (European Translator) programme. The transfer model chosen for
Eurotra requires the analysis and generation modules for each language
to be designed from a monolingual standpoint. Each nation is responsible
for the analysis of its own language and for transfer from the other
languages into its language (2). This leads to 72 transfer modules for
the nine official languages of the Community. The French team, for
example, is responsible, for each of the eight other languages, for the
work shown as a solid line in the diagrams below, where the symbol IS
denotes the interface structure.


[Diagrams: two transfer paths between interface structures (IS -> IS),
one from French to another language and one from another language to
French.]
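
The figure of 72 transfer modules follows from counting ordered pairs
of distinct languages: 9 x 8 = 72. The short sketch below reproduces the
count. The list of nine languages is an assumption added for the
illustration (the text does not name them), and the variable names are
hypothetical.

```python
from itertools import permutations

# The nine official Community languages, listed here as an assumption.
LANGUAGES = ["Danish", "Dutch", "English", "French", "German",
             "Greek", "Italian", "Portuguese", "Spanish"]

# One transfer module per ordered (source, target) pair of distinct languages.
transfer_modules = list(permutations(LANGUAGES, 2))

# Modules falling to the French team: transfer from each other language into French.
french_team = [(src, tgt) for src, tgt in transfer_modules if tgt == "French"]

print(len(transfer_modules))  # total number of transfer modules
print(len(french_team))       # transfer modules handled by one national team
```

The count grows as n(n - 1), so each additional language adds transfer
modules for every language already covered, in both directions.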

Transformation of the environment

Such a change in the way the CAT problem is addressed radically
transforms the environment. Developments on Systran are left to users,
whereas with Eurotra each nation mobilises the university teams most
competent in the analysis and representation of the national language.
This is done in very general terms, independently of any particular
needs that may later concern a given user in a given context; that user
can then add a knowledge base specific to his own environment.

We are thus entering an entirely different world. CAT becomes one
component, or one application segment, of language processing and the
language industry, among many related applications that will all
reinforce one another.


Figure 3 shows that language processing is increasingly becoming a
discipline in its own right, serving CAT but equally communication in
general: recognition of the content of texts or speech, and electronic
generation of documents with a view to their processing, whether for
document retrieval systems, dialogue with expert systems, or the
statistical, conceptual or informational study of content.

An interesting exercise, for anyone wanting confirmation of this
phenomenon, is a quick analysis which, without going as far as a
bibliometric study, might run as follows: starting from a bibliographic
file such as Inspec, use a very simple command such as '..memt' in
Questel Plus to see what the semantic environment is of assisted or
automatic translation on the one hand and of language processing on the
other (cf. figure 4).

For translation, one notes the prominent place of dictionaries, then
... of government policies, which is a sign of the awareness mentioned
above, then of DBMSs, knowledge bases, microcomputer applications and
word processing, through to computer-aided publishing and expert
systems.

When language processing is taken as the focal point, one finds
dictionaries, word processing, computer-assisted training, electronic
publishing, indexing, microcomputer applications and so on.

Groupings around language processing.

Just as hitherto scattered university teams will come together to share
out tasks instead of ignoring or competing with one another, so
companies will see groupings that bring together everything that is
related and interdependent. One example is the integration of the
document production chain, from authoring aids through to distribution,
by way of translation and standardisation and, of course, the pooling of
tasks, skills and professions.

The language industry, or language processing, truly emerges as a new
paradigm. G. Dosi has defined the technological paradigm as a set of
problems, procedures and tasks linked to technological development, in
which market forces and demand act as a selection mechanism (5). Once a
path of technical change has been created, it has a momentum of its own,
which defines the directions in which problem-solving activity moves. We
are thus now moving from a system-based conception (CAT) to a set of
functional requirements (situated in the language processing environment
and eventually becoming integrated) (6).

Consolidation and importance of the language industry:

The market will therefore be consolidated by several factors linked to
technological progress and to advances in software engineering and in
CAT linguistic software ('linguiciels'), but also upstream and
downstream of CAT.

For example, the publishing sector, or at least those publishers who
have committed themselves to the same path, will benefit from the wider
reach that CAT integrated into a publishing chain provides. Already,
Japanese patents, for example, are no longer a closed field. Where
publication was planned in one language only, bibliographic databases
make information on the existence and content of documents published in
that language accessible to users of other languages. This happens in
one of two ways. Either the database producers have entered, for
example, a multilingual thesaurus, as in the case of INIST's Pascal file
(specimen in fig. 5), which is indexed in three languages and can thus
be searched with queries in French, English or Spanish. Or the host has
acquired software that carries out this transposition from one language
to another even if the text has not previously been indexed with
keywords. The transposition may be complete or partial, accompanied by
highlighting of the essential content of the document, or it may focus
on the aspects related to the user's interests or 'profile'. At the same
stroke, useful information can be detected far more precisely than with
logical operators alone applied crudely between words. This relies on
text analysers that are useful both for understanding the query,
possibly through a dialogue with the user, and for selecting and
transposing material into the user's language.

More generally, online databases are an important source of linguistic
information and an aid to translation, as Hikomaro Sano explains (19).

Another possibility is to provide the user with a facility for
transferring the results of an online search to a translation server of
the Systran type and retrieving a raw translation which, in some cases,
can do without post-editing (for example when the aim is to scan the
titles and indicative abstracts of documents whose content one wishes to
check before ordering them or having them translated).

One can also imagine machine translation, in advance, of the whole
bibliographic database and its updates.

Even if this translation is imperfect, the specialist reading the
titles and abstracts will have little difficulty in making the necessary
corrections, almost unconsciously. I have personally found that a German
abstract machine-translated into French conveys the content well enough
to decide whether or not the document is worth ordering for translation.

It is thus the whole online information market that will be able to
merge with the CAT market. The turnover of online services is already
$4 to 7 billion (71%) in the United States, $1 to 1.5 billion (18%) in
Europe, $0.5 billion (10%) in Japan, and less than $0.05 billion (1%) in
the rest of the world.

Databases that until now were totally impenetrable or of little
economic interest are seeing their impact factor increase, and the
impact of the products, services and culture they convey is strengthened
likewise.

Alongside translation server systems usable both by the general public,
via Minitel, and by organisations equipped with powerful, specially
designed facilities, the rise of mini- and microcomputing has brought a
market for small systems. These can be very effective if they work in a
well-circumscribed field with a well-controlled vocabulary, even if the
syntax they handle is extremely simple. This is the case of ALPS,
Macrocat (Weidner) and Bravice (Japan). Their success is also explained
by the fact that organisations hold confidential information that cannot
be sent to an outside server. CAT hosts will therefore have to think
about micro versions of their software, to be installed within
companies, just as Questel produced micro-Questel, if they do not want
to risk losing market share.

In terms of systems, the breakdown in Europe was recently as follows:
Logos (26%), Weidner (23%), Ericsson (16%), Systran (13%), Alps (12%)
(5).

The EEC devotes a substantial share of its budget to translation, about
1 billion francs a year, and employs 1800 translators divided between
Brussels and Luxembourg. Its choice of Systran, whose development it
largely financed (V MECU from 1977 to 1982), has been contested, but
adequate intelligibility of the raw translation has nevertheless been
achieved, as it has at NATO (specimens at the end of this paper). Since
then it has been investing in research of its own (EUROTRA) while
remaining one of the main users of Systran (6). The world market is
estimated at 3 billion dollars a year, representing 150 million pages
and employing 175000 people.

Some say that these figures fall well short of the potential market
that will emerge once more user-friendly and more powerful systems are
ready. The fact remains that demand is expected to grow by 50% in five
years, and that CAT should quickly take 5 to 15% of this market.

These figures are difficult to verify. One can only cross-check
information from various sources.

In research, Japan has planned a gigantic national effort in support of
CAT, expected to swallow up, over the next 12 years, a budget comparable
to that of ICOT for fifth-generation computers. Two major research
programmes worth several billion francs have been started: one to set up
the Electronic Dictionary Research Institute (1.5 billion francs, with
eight industrial partners), the other, on the initiative of the Ministry
of Posts, to develop a translating telephone (4 billion francs). In 1985
there were already 18 CAT projects in Japan, and several
second-generation systems, taking a semantic approach to language
models, are already on the market. The NOVA company, in particular,
offers for the English-Japanese pair a workstation that translates 50
pages, or 20000 words, in an hour. The system, priced at 5.55 million
yen, could sell 200 units in 1989. Fujitsu's ATLAS 2 and NEC's Pivot can
translate up to 60000 words an hour.

This does not prevent Japan from also working on Systran, and even from
winning major contracts from the American government for
Japanese-English translation. For this language pair Systran is said to
translate 1.2 million words (6000 A4 pages) in an hour, with an accuracy
of over 85%. This decision by the American government is intended to
redress the imbalance in information exchange between the United States
and Japan. We also know that the University of Edinburgh is cooperating
with Japan on spoken language processing, and that French researchers
are contributing in a closely related direction, that of machine
interpretation (7). It is here that the most spectacular progress is to
be expected, with the development of 'magic dictation' machines and
phoneme analysis.

With the development of the telephone through the ISDN and post-ISDN
phases and, in parallel, neural networks and connectionist machines, we
shall at last have a computing and telematics environment particularly
well suited to the processing of written and spoken language.

This prospect must be taken into account in any assessment of the
growth of the CAT market.

At the other end of the market spectrum, and far more modestly, there
is room for simple, portable aids intended for particular applications,
for example among the military and pilots in particular, in a context of
interoperability that must be achieved without adding to the stress to
which personnel are already subjected. On a very modest scale, SANYO
offers a portable English-Japanese electronic dictionary of 35000 words
intended for students and businessmen. The possibilities already offered
by CD-ROM compact discs and by hypertext and multimedia software make
the emergence of a wide variety of applications all the more likely.
This share of the market should therefore not be ignored, and attention
should be paid to the niches associated with the development of speech
synthesis in which there is room for translation: aids for the
handicapped, operator assistance, electronic games, pocket translators,
computer-assisted teaching (notably of languages), telephone enquiries
and voice messaging, task monitoring, voice alarms, voice keyboards,
spoken announcements (tides and weather, train or flight times, bus
stops, traffic synthesisers) and other applications of speech
processing, which has made considerable progress (dictation machines, or
Crouzet's voice command for the Rafale).

If groupings take place between neighbouring application areas, the
assisted translation market should grow very quickly. At present the
scattering of peripheral tools (word processing or publishing tools,
spelling checkers, readers or digitisers) makes it difficult to assess
how the market is progressing.

For example, the world market of the spoken language industries, a
market which will itself interact with that of written language, was put
at 14 million francs in 1984 and is expected to reach 28 billion as
early as 1990 (one third for speech synthesis, two thirds for speech
recognition), i.e. two thousand times more in six years.
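
The 'two thousand times' figure can be checked, and converted into an
average yearly growth factor, with a few lines of arithmetic. The sketch
reads the two dates as 1984 and 1990 and assumes steady compound growth,
which the source does not claim; the variable names are illustrative.

```python
# Growth implied by the two quoted market estimates (in francs).
market_1984 = 14_000_000          # 14 million francs
market_1990 = 28_000_000_000      # 28 billion francs
years = 1990 - 1984

# Overall multiplication factor over the whole period.
growth_factor = market_1990 // market_1984

# Average yearly factor, assuming steady compound growth over six years.
annual_factor = (market_1990 / market_1984) ** (1 / years)

print(f"Overall growth factor: {growth_factor}")
print(f"Implied average annual factor: {annual_factor:.2f}")
```

The overall factor is 2000, as stated; under the compound-growth
assumption this means the market multiplying by roughly three and a half
every year, which gives a sense of how optimistic the forecast is.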

Thus, although the CAT market has not really taken off in Europe and
North America, an explosion of the language industry market as a whole
can be expected, an explosion controlled by the Japanese unless the
European framework programme awaited as a response to Japan's
international strategy materialises. A Commission representative is
reported to have indicated in Munich two scenarios for the LIFE
programme (Language Industries in Europe), ranging from 150 million FF
to 2 billion FF.

The technical environment proper.

This general survey of developments has shown that CAT has now emerged
from its isolation, that it must move from the laboratory stage to
industrialisation in order to take its place in an operational
environment, on the same footing as other applications of natural
language processing, and that it must be capable of being integrated
into the document processing chain.

It should be recalled that the technological environment has changed
fundamentally since the early days of CAT. It seems to me that only
Peter Toma, the father of Systran, had at the time a vision of what this
evolution would be: greater capacity of main memory and above all of
peripheral storage at ever lower prices, the evolution of programming
languages, the advent of reliable broadband networks for data
transmission, the spread of office workstations and mini- or
microcomputers able to call on electronic dictionary servers, text
database servers and translation servers, and finally the advent of
artificial intelligence. This technological evolution makes a major
breakthrough in the near future entirely plausible in CAT or
computer-assisted interpretation, given the progress made in related
sectors of the language industry: speech recognition and digitisation of
phonemes, optical character reading, and so on.


Integration into the document-processing chain can take several forms. In the relatively
well-defined case of technical documentation, for example at the large aerospace manufacturers,
which have to produce and translate enormous volumes of documentation accompanying their
equipment (by way of example, translating the documentation of an Airbus requires 80,000 hours
of translation work, i.e. 39 man-years, at a cost of 8 million French francs), integration
begins in the design offices with CAD (computer-aided design) or with computer-integrated
manufacturing. It seems increasingly questionable, and even absurd (8), to go back to a bulky
paper medium of limited interest when the information is, or will have been, digitized and
tagged with SGML (Standard Generalized Markup Language) under CALS (the Computer-Aided
Acquisition and Logistic Support initiative), a programme launched in the United States but
already followed by a number of countries (Eurocals). The information will thus be accessible
online, in the form the user wants, and no longer in the single, predetermined, frozen
presentation imposed by paper; it will come from the source organization, which is best placed
to generate and update it, either directly or through a gateway. Clearly this will not only
yield substantial savings: up-to-date information will be available at any time, possibly along
with its up-to-date translation into the languages of the main customer countries. This prospect
is not far off, since the text analysers and other linguistic tools used for CAT will in any
case also be useful for all interfaces, such as querying multilingual databases, gateways giving
access to them, expert systems and their associated knowledge bases, and for other
language-processing applications such as rapid scanning (skimming) and systematic routing to
users (9).
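
As a rough check on the Airbus figures quoted above (80,000 hours, 39 man-years, 8 million French francs), the short Python sketch below derives the working hours per man-year and the cost per hour that those figures imply. Only the three input values come from the text; the variable names and the derived quantities are illustrative.

```python
# Figures quoted in the text for translating the documentation of one Airbus.
translation_hours = 80_000
man_years = 39
total_cost_ff = 8_000_000  # 8 MFF, i.e. 8 million French francs

# Derived quantities: implied by the text, not stated in it.
hours_per_man_year = translation_hours / man_years
cost_per_hour_ff = total_cost_ff / translation_hours

print(f"{hours_per_man_year:.0f} hours per man-year")
print(f"{cost_per_hour_ff:.0f} FF per hour of translation work")
```

The three figures are therefore mutually consistent with a working year of a little over 2,000 hours.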

Where CAT should be located.

In most cases the ideal and obvious place to locate it, so as to give it the most favourable
environment, will be the information or documentation service that exists, at various stages of
development, in every organization. The work of a documentation service, whether or not it has a
specific translation remit, already includes all the facets identified above in the CAT
environment: the service is responsible for generating, collecting, archiving, processing,
selecting and disseminating documents, and above all for managing their information content
through analysis and indexing, which are likewise computer-assisted. It necessarily already has
linguist staff, since the information handled is in several languages, and these staff are
moreover specialized in the technical fields covered by the centre. They already have long
experience of problems of semantics. They have created and use monolingual or multilingual
lexicons and thesauri, they query terminology data banks, they develop and use linguistic
analysis software for document retrieval or bibliometrics, and they already handle optical
readers or digitizers, or word processors, to capture the information collected. This is without
any doubt where teams and resources should be strengthened if need be, and above all where
duplication through implantation elsewhere should be avoided.

The processing stages:

Generation, collection and capture of the text, and the pre-processing associated with this
phase:

If the aim is to move towards industrialization, the production of digitized text must be
promoted and encouraged, whether through word processing, electronic publishing or any
transaction that provides, from the outset, a representation of the text that avoids the costly
and unreliable keyboarding that used to be necessary.

The text may thus come from a magnetic tape used in computerized typesetting and then stripped
of its typesetting codes, from a download, or from an electronic mail system. Only where no
non-printed medium is available will it be necessary to fall back on an optical reader or on a
digitizer able to apply image and character recognition algorithms. From the moment the text is
generated, full use should be made of all the resources that office automation can offer.

Without going as far as the constraints accepted in the TITUS system (figure 6) developed by the
Institut Textile de France, constraints that are conceivable only in a fully controlled
environment, one can use tools and software such as those offered by Microsoft on a CD-ROM
providing the following functions:

-200,000-term dictionary (American Heritage)
-Roget's thesaurus
-quotations from Bartlett's Familiar Quotations
-World Almanac and Book of Facts
-reference work on the art of writing (Chicago Manual of Style)
-spelling checker based on a phonetic algorithm
-usage checker (context-dependent)
-forms and standard letters...
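
The list above mentions a spelling checker based on a phonetic algorithm without naming it. The sketch below uses the classic Soundex code purely as a stand-in, to show how a phonetic key can retrieve correctly spelt candidates for a misspelt word; the product's actual algorithm is not described in the text, and the small lexicon is invented for the example.

```python
# Soundex digit for each consonant; vowels, h, w and y carry no digit.
CODES = {}
for digit, group in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                     "4": "l", "5": "mn", "6": "r"}.items():
    for letter in group:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character Soundex code of a word ('' if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "hw":
            continue  # h and w do not separate two consonants with the same digit
        digit = CODES.get(letter, "")
        if digit and digit != previous:
            code += digit
        previous = digit  # a vowel resets the run, so a repeated digit is kept
    return (code + "000")[:4]


def suggest(word, lexicon):
    """Propose the lexicon entries that sound like the (possibly misspelt) word."""
    target = soundex(word)
    return [entry for entry in lexicon if soundex(entry) == target]


# Invented mini-lexicon, for illustration only.
lexicon = ["translation", "transition", "terminology"]
print(soundex("Robert"), soundex("Rupert"))
print(suggest("tranzlation", lexicon))
```

A phonetic key is deliberately coarse: it returns several candidates, and a second step (edit distance, frequency or context) would be needed to rank them.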

One thus very quickly enters a pre-processing stage and the domain of linguistic software
("linguiciels"), where various aids exist. For example, the work of Janine Gallais-Hamonno has
shown that the syntax used by the various professions or scientific communities varies with the
speciality. The LIOIR tools (10), parameterized for French and English, make it possible to
improve a text within a speciality. The aim is to translate as if the text had been written by a
specialist in the target language, in other words no longer to translate the text as it stands
but to modify it to take account of the modes of thought, culture and expression of the target
language. Still in the same environment, ANAGOGE makes it possible to build concept dictionaries
automatically, to group together the semantic fields that specialists associate with each
concept, to analyse the rhetorical frameworks of the input texts and build libraries of such
frameworks, each corresponding to a way of presenting a document or to a type of argument, and
to identify the concepts used in the texts and hence take their translation from a dictionary of
equivalences, which avoids the errors that arise when only word occurrences, rather than
concepts, are considered.

In the same vein, HIERARCHIE makes it possible to extract concepts, analyse their hierarchy and
automatically create thesauri of the terms or expressions that designate the concepts. It works
for French and English and can be used for the automatic indexing of texts in English or French,
the automatic feeding of a database, message routing, and the automatic analysis of translations
(automatic detection of errors), all this after parameterization on a few hundred pages of text
from a given speciality. This implies that the translation work should involve documents of
relatively large volume, forming series if possible. These programs are being adapted to Russian
and Japanese.

Format recognition, or formatting, is also part of this preliminary "pre-editing" stage. For
example, when translating bibliographic records taken from databases, one may choose to submit
only the title, abstract and keyword fields for translation, and therefore to recognize the
other fields so that they can be ignored for the time being.
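
The field selection just described can be sketched as follows. This is a minimal illustration that assumes records laid out as "TAG: value" lines; the two-letter tags (TI, AB, KW, AU, SO) and the sample record are hypothetical and do not reproduce the format of any particular database.

```python
# Hypothetical field tags: TI = title, AB = abstract, KW = keywords.
TRANSLATABLE_FIELDS = {"TI", "AB", "KW"}


def split_record(record_text):
    """Separate the fields to be translated from those set aside for now."""
    to_translate, set_aside = {}, {}
    for line in record_text.strip().splitlines():
        tag, _, value = line.partition(":")
        tag, value = tag.strip(), value.strip()
        target = to_translate if tag in TRANSLATABLE_FIELDS else set_aside
        target[tag] = value
    return to_translate, set_aside


# Invented sample record, for illustration only.
record = """
AU: Dupont, J.
TI: Traduction assistée par ordinateur
SO: Revue fictive, 1987
AB: Présentation d'un système de traduction.
KW: traduction; terminologie
"""
to_translate, set_aside = split_record(record)
print(sorted(to_translate))
print(sorted(set_aside))
```

The fields set aside are kept, not discarded, so that the full record can be reassembled once the selected fields have been translated.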

Transliteration is another upstream operation, and one that can be fully automatic.
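
Transliteration lends itself to full automation because it reduces to a character-by-character table lookup. The sketch below assumes Russian input and uses a simplified Cyrillic-to-Latin table written for this example; it is not claimed to follow any particular transliteration standard.

```python
# Simplified Cyrillic-to-Latin table for Russian (illustrative, not a specific standard).
TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def transliterate(text):
    """Replace each Cyrillic letter by its Latin equivalent, preserving capitals."""
    out = []
    for ch in text:
        latin = TABLE.get(ch.lower())
        if latin is None:
            out.append(ch)  # not a mapped Cyrillic letter: keep unchanged
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


print(transliterate("Туполев"))
print(transliterate("машинный перевод"))
```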

On the remaining points of detail, preparation of the text for optimal recognition can be guided
by the system, which will ask questions about whatever seems ambiguous to it or whatever it
cannot interpret on first reading. The operator will have to supply the specific markers that
provide information on each of these points, or indicate that a given obstacle can be ignored.

This operation may well be tedious for a translator, who will on the other hand be indispensable
for post-editing. Nevertheless, the work must be entrusted to staff who have some knowledge of
the source and target languages and who can carry out an enrichment or learning function for
everything of a recurrent nature, interactively if possible.


Terminology.

The organization of terminology resources determines the quality of the results. In the user's
sector of activity or speciality, whatever it may be, there are general terms, forming a
vocabulary usually supplied by the system vendor (for example Robert & Collins for Weidner), and
sectoral dictionaries, some of which may also be supplied with the system. Priority is given to
one sectoral dictionary or another according to the content of the document being processed. But
this terminology resource is not enough. The user still has to make a fairly substantial effort
to master the semantics properly, so as to reach an adequate level of intelligibility in the raw
translation and make post-editing easier. All those who have achieved tangible results with CAT
have understood that this effort, which is an investment, had to come first. This is how the
European Communities, Aérospatiale and the Institut Textile de France, for example, proceeded.
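
The idea of giving priority to one sectoral dictionary or another can be sketched as an ordered lookup that falls back on the general vocabulary. The dictionaries, sector names and entries below are invented for the example and are not taken from any of the systems mentioned.

```python
# Invented sample dictionaries, for illustration only.
GENERAL = {"cellule": "cell", "train": "train", "voilure": "wing"}
SECTORAL = {
    "aerospace": {"cellule": "airframe", "train": "landing gear"},
    "biology": {"cellule": "cell"},
}


def lookup(term, priorities):
    """Search the sectoral dictionaries in priority order, then the general one.

    Returns (translation, name of the dictionary that supplied it).
    """
    for sector in priorities:
        entry = SECTORAL.get(sector, {}).get(term)
        if entry is not None:
            return entry, sector
    if term in GENERAL:
        return GENERAL[term], "general"
    return None, None


# For an aerospace document, the aerospace dictionary is consulted first.
print(lookup("cellule", ["aerospace", "biology"]))
print(lookup("cellule", ["biology"]))
print(lookup("voilure", ["aerospace"]))
```

Changing the priority list per document is enough to switch the rendering of an ambiguous term, which is why the choice of priorities has to follow the content of the document.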

It is worth looking more closely at what happens at Aérospatiale (15). All terminology
activities are carefully coordinated according to the diagram below, which represents the
company's true terminology network; its coordinator interacts with the terminology data banks
and with the dictionaries produced by the company. The effort in this field is remarkable:

1971 French-English dictionary, 10,000 terms

1978 trilingual dictionary (French, English, German), 25,000 terms

1984 quadrilingual dictionary (French, English, German, Spanish), 50,000 terms

and so on, with a dictionary of abbreviations, a dictionary of definitions, and access to an
in-house terminology bank (all these sources being digitized, of course) and to external
terminology banks, whether national, European or international.

It is probably thanks to this investment in terminology that Aérospatiale found itself in a very
favourable position to negotiate the use of Systran with the Commission of the European
Communities as early as March 1982, given that it contributed a quadrilingual dictionary giving
good coverage of the aerospace sector.

It is in this context that Aérospatiale can quote overall costs of 0.45 FF per word for the
French-English and English-French pairs, costs calculated over the whole processing chain: text
preparation and optical reading, raw translation, and refined post-editing. The same
Aérospatiale article (15) gives detailed cost figures.

When dictionaries are produced in-house, the result is very likely to be much better, especially
if the work is led by the information or documentation service, where people have long been used
to reasoning in terms of concepts rather than isolated words or "uniterms". All information
specialists know how dangerous it is to separate terms that stand in a paradigmatic relation.
For example, "rayonnement de freinage" is a single entity, just like its equivalent
"bremsstrahlung". One recalls the failures Taube experienced with his uniterms and the successes
of Mooers with his descriptors. If this point is properly understood, the belly landing of a
Tupolev ("l'atterrissage sur le ventre d'un Tupolev") can in no way become, in translation, a
landing "on the belly of a Tupolev". In English, for example, "belly-landing" forms a whole,
whereas French allows this type of error if the reasoning has not been done at the conceptual
level.
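
Treating a multi-word term as a single entity can be sketched as a greedy longest-match lookup: at each position the longest known term is tried first, so that "rayonnement de freinage" is never broken into its component words. The term table below is a tiny invented sample; real systems would also need morphological analysis, which is left out here.

```python
# Invented term table: keys are tuples of words, so multi-word terms are single entries.
TERMS = {
    ("rayonnement", "de", "freinage"): "bremsstrahlung",
    ("atterrissage", "sur", "le", "ventre"): "belly landing",
    ("rayonnement",): "radiation",
    ("freinage",): "braking",
}
MAX_LEN = max(len(key) for key in TERMS)


def translate_terms(words):
    """Greedy longest-match lookup: try the longest candidate term first."""
    out, i = [], 0
    while i < len(words):
        for n in range(min(MAX_LEN, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in TERMS:
                out.append(TERMS[key])
                i += n
                break
        else:
            out.append(words[i])  # unknown word: left untranslated
            i += 1
    return out


print(translate_terms("rayonnement de freinage".split()))
print(translate_terms("un atterrissage sur le ventre".split()))
```

A word-by-word lookup over the same table would render the first phrase as "radiation de braking", which is exactly the uniterm error the text warns against.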

Translation proper.

So-called first-generation systems used the direct method of passing from the source language to
the target language, that is, word-for-word translation followed by a procedure that rearranged
the words in the sentence, using reconstruction rules to arrive at acceptable sentences without
any understanding of the content.
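
The direct method can be illustrated with a toy French-to-English example: a word-for-word substitution followed by a single reordering rule (noun + adjective becomes adjective + noun). The lexicon, the part-of-speech tags and the rule are assumptions made for this sketch and do not describe any actual first-generation system.

```python
# Invented bilingual lexicon: source word -> (target word, part of speech).
LEXICON = {
    "la": ("the", "DET"),
    "documentation": ("documentation", "NOUN"),
    "technique": ("technical", "ADJ"),
    "traduction": ("translation", "NOUN"),
    "automatique": ("automatic", "ADJ"),
}


def direct_translate(sentence):
    """Word-for-word substitution, then one rearrangement rule, with no analysis of meaning."""
    # Step 1: substitute each word, keeping its part-of-speech tag.
    tagged = [LEXICON.get(w, (w, "UNK")) for w in sentence.lower().split()]
    # Step 2: reordering rule NOUN ADJ -> ADJ NOUN.
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "NOUN" and tagged[i + 1][1] == "ADJ":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2
        else:
            i += 1
    return " ".join(word for word, _ in tagged)


print(direct_translate("la documentation technique"))
print(direct_translate("la traduction automatique"))
```

The second call yields "the automatic translation", a grammatical phrase but not the established English term "machine translation", which shows the limit of a method that works without any understanding of the content.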

Of course, no rules can be found that are general enough to apply to every type of text content,
unless the types of content are restricted, as in Titus. The next step was therefore an
interlingual approach, in which the system tries to understand the meaning of the sentence
through an analysis leading to a representation that is independent of the target language.

The notion of a pivot language thus gave way to the notion of a transfer or interface structure,
as found in Eurotra. The GETA group in Grenoble has followed the same path since Ariane-78, once
that software had been tested on a number of language pairs (16). Most systems today therefore
regard themselves as second- or third-generation, depending on how far they call on this
transfer structure, on artificial intelligence, and on the rules that have been tried out
successfully, albeit in limited domains, with expert systems. As a consequence, less and less
use is made of the simplistic system of equivalences offered by dictionaries, and the trend is
towards taking "descriptors" and set phrases into account more systematically.

Some languages are now the subject of very detailed studies aimed at representing, by means of
graphs, all the meanings of words in all their contexts of use, with the results tested on
corpora made available to linguists, such as the "Trésor de la langue française".

The LADL (Laboratoire d'automatique documentaire et de linguistique) of Professor Maurice Gross
has put 130,000 French verb forms on tape. The purpose of its electronic dictionary is to
describe the language from top to bottom, leaving out no expression and no idiom. Language thus
becomes an industrial raw material that can be used in all its applications, including CAT.

According to Mr Oreja, Secretary General of the Council of Europe, languages that are not
industrialized will cease to be vehicular languages, languages of civilization.

The Eurolexic project, coordinated by France, aims to create an electronic grammar and
dictionary, initially in four languages (English, Spanish, French and Italian), with the other
European languages to follow.

In each language, and within the general framework of the language industries, a linguistic
knowledge base is thus being built up. It is not intended specifically for CAT, but CAT can
benefit from it, while possibly also drawing on other tools, procedures or algorithms developed
jointly or inherited from other sectors such as document retrieval, automatic indexing,
bibliometrics, message routing and expert systems: for example, procedures for isolating pairs
and triplets, counting occurrences and weighting accordingly, and correlating in order to
determine or analyse the context.
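
The procedures mentioned above (isolating pairs and triplets, counting occurrences, weighting) can be sketched in a few lines. The weighting used here, the share of a word's occurrences that are followed by a given second word, is only one possible choice made for the example, and the sample text is invented.

```python
from collections import Counter


def ngrams(words, n):
    """Return all runs of n consecutive words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def weighted_pairs(words):
    """Weight each adjacent pair by its count relative to the count of its first word."""
    word_counts = Counter(words)
    pair_counts = Counter(ngrams(words, 2))
    return {pair: count / word_counts[pair[0]] for pair, count in pair_counts.items()}


# Invented sample text, for illustration only.
text = ("banque de données banque de terminologie "
        "banque de données base de données").split()
pairs = Counter(ngrams(text, 2))
triplets = Counter(ngrams(text, 3))
weights = weighted_pairs(text)

print(pairs.most_common(2))
print(triplets.most_common(1))
print(weights[("de", "données")])
```

Recurrent triplets such as "banque de données" are natural candidates for entry in the dictionary as single terms, and the weights give a first indication of which word tends to follow which in a given speciality.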

Of all this software and linguistic software, the user naturally has only an external view, that
of the result and the turnaround time. His judgement, his acceptance or rejection, will bear on
the result and on how easily he can evaluate and use it, that is, a raw translation, more or
less forbidding, which he will have to make intelligible. This shows how important the
ergonomics of the presentation of these results will be, along with the working conditions
offered to the translator or end user.

Depending on the case, the user will be asked to react either after the event or online and in
real time, if the system is interactive. Such interactivity can develop only in-house, since
only within a single organization is it possible to set up safeguards against contradictory
corrections or additions, to set parameters correctly and to answer the system's questions
properly, all of which requires a degree of specialization and linguistic competence.

With large online translation servers, however, there can be no question either of each user
operating freely according to his own interests, given the risk to other users. "Navigation
aids" must not, overnight, come out as "SIDA" (the French acronym for AIDS) because a user in
the biomedical sector requested an addition that was processed without precautions.
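
One possible safeguard of the kind called for above is sketched here: every entry in the shared dictionary is scoped to a subject field, and a proposal that contradicts an existing entry is queued for review instead of overwriting it. The class, its method names and the sample entries are hypothetical and do not describe how any existing translation server works.

```python
class SharedDictionary:
    """Shared term base in which each entry is scoped to a subject field."""

    def __init__(self):
        self.entries = {}   # (term, field) -> translation
        self.pending = []   # contradictory proposals awaiting review

    def propose(self, term, field, translation, user):
        """Accept a new entry, or queue it if it contradicts an existing one."""
        key = (term.lower(), field)
        current = self.entries.get(key)
        if current is not None and current != translation:
            self.pending.append((key, translation, user))
            return "pending review"
        self.entries[key] = translation
        return "accepted"

    def lookup(self, term, field):
        """Return the translation recorded for this term in this field, if any."""
        return self.entries.get((term.lower(), field))


shared = SharedDictionary()
shared.propose("navigation aids", "aerospace", "aides à la navigation", "user_a")
shared.propose("AIDS", "biomedical", "SIDA", "user_b")

# The biomedical addition does not leak into the aerospace field.
print(shared.lookup("navigation aids", "aerospace"))
print(shared.lookup("AIDS", "aerospace"))
```

Scoping by field keeps one user's addition from affecting other users' results, and the review queue addresses the risk of contradictory corrections mentioned earlier.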

For the user of a server system, the greatest frustration also comes from the time that elapses
between sending his comments and seeing them taken into account. Another frustration stems from
the present inadequacy of context analysis, which leaves the system powerless in the face of
polysemy, and from the lack of common expressions, which should have been entered as idioms with
their equivalent in the other language; the machine is quite incapable of reconstructing such
expressions if they have not been recognized and transposed as a whole (e.g. "subsidiairement"
and "toutes choses égales d'ailleurs").

Post-editing, or revision.

This barbarous term designates the phase in which the translator makes acceptable and
intelligible a result in which good and mediocre are so intermingled that it is sometimes hard
to untangle the threads. Easy as it is to spot a weakness of style or composition, or a
mistranslation, in an original or in a human translation, it is a delicate matter to put one's
finger on the shortcomings of a raw translation and locate defects that may be of any kind and
wholly unexpected. Since the task is particularly thankless, everything possible must be done to
improve the ergonomics of the operation: for example, highlighting the passages where the
machine hesitated or put down anything at all on the off chance, or displaying the source text
and the corresponding target text side by side, with aids for locating each sentence and a
multi-window system for displaying dictionary data and consulting passages from earlier
translations. For Systran, an excellent analysis of the user's working conditions and
environment is given by Pigott (17), who shows the importance of a number of details on which
success or failure depends.


The user

To understand and analyse better the reasons for the success that CAT is beginning to enjoy, we
must finally turn to the translator or end user, since the environment begins and ends with the
user. We saw above that the translator had been clumsily kept out of CAT developments, although
he could have contributed a great deal had he been closely involved. It must be said that
translators, for their part, have difficulty making themselves heard, in a profession that is
poorly structured and poorly represented.

A survey conducted in 1986 (16) reveals that the great majority are indeed independent
(free-lance) translators, working alone, like craftsmen.

They indicated the time they devoted to each type of task:

-10 to 15 % on terminology research
-10 % on text preparation or "pre-editing"
-about 50 % on translation proper
-15 % on "post-editing" and layout for printing.

The same survey shows that most texts are technical or commercial.

The text arrives typed in 51 % of cases, printed in 33 %, handwritten (7 %), on diskette or
magnetic tape or by download (4 %), and on audio media (0.5 %).

Clearly, with the spread of word processing and electronic publishing, the text will in many
cases already be digitized and "clean", thanks to spelling checkers and other tools available
upstream, so that pre-processing will be considerably lightened and its cost reduced. It is also
clear that the widespread use of terminology banks will reach the translator's environment, so
that reductions can be expected in the corresponding 10 %.

It appears that, without CAT, 50 % of the translator's time is spent on translation proper and
15 % on the phase that follows. The introduction of CAT, even if it makes "post-editing" heavier
and so increases the 15 %, reduces the 50 % very appreciably. Change will be relatively rapid,
since the survey reveals that more than 50 % of translators were already using a personal
microcomputer for word processing in 1986. Under these conditions it is hard to see how anyone
could baulk at adding a good CAT software package.


Conclusion:

After this somewhat winding tour of the technical environment of CAT, what are the essential
points that deserve to be brought together and remembered? Let us try to list them:

CAT is only one component of the language industry. It is no longer an isolated activity but can
benefit from research and development in other segments of that industry.

The market is large, even if it is difficult to assess. There are various types of needs and
markets: technical documentation, online information servers, and applications linked to speech
synthesis and recognition and to simultaneous interpretation.

Here, as for translation, care must be taken to enlist the help of interpreters, whose mental
mechanisms, in both simultaneous and consecutive work, isolate the ideas and concepts beyond the
words and practically rule out mistranslation; these mechanisms could usefully be analysed.

Each nation, or each group of nations sharing a language, is led to "industrialize" its language
in order to preserve its culture and influence.

A miracle product that would spare users any development effort should not be expected from
research. The solution certainly lies in users grouping together so that this development is
carried out jointly, and therefore at lower cost.

Translation must be integrated as fully as possible into the document-processing chain, and
digitization at source must be encouraged, avoiding any return to paper when paper is not
indispensable.

The task of the end user, and of the translator in particular, must be made easier by ensuring
good ergonomics at the points where he intervenes. It must also be known, and made known, that
CAT does not lead to the disappearance of translators but changes their working conditions,
providing an increasingly valuable aid obtained for their benefit and thanks to them, provided
the ergonomics of the new workstations are properly studied.

The user's comments must be taken into account quickly and easily, so that he agrees to take
part in the evolution of the system as he sees improvements being made.

In the context of CAT, any formula that helps to lower language barriers should also be
encouraged, even when it is not CAT in the strict sense and even if the objectives are very
modest: online consultation and frequent updating of linguistic tools such as conventional
dictionaries or multilingual thesauri, systems for concept extraction, automatic indexing and
analysis, rapid scanning of texts, routing, or querying databases from another language, all the
more so since progress in these fields will feed back into CAT, which already benefits greatly
from them. The WTI (World Transindex) database, which brings together records of translations
made throughout the world (300,000 references since 1977) and thus makes translations far more
accessible, deserves a special mention here.

The sector covered by the CAT system must be limited and, if it is not possible to keep to one
sector, the context indicators that are still lacking must be developed, with the help of
artificial intelligence.

Action must take account of the existence of several kinds of use, and therefore of several
distinct CAT markets, each requiring its own environment, accepted by its users, for example:

-online translation server systems
-systems integrated into a document-processing chain within the company
-systems for the individual use of the independent translator, etc.

And finally, one must be neither too optimistic nor too sceptical, and must recognize that this
matter will call for a sustained effort and an investment commensurate with what is at stake.

oooOooo

Figure 2. TAUM-METEO: biblical simplicity.

Figure 5: Extract from the PASCAL file of INIST. Example of multilingual indexing.

Titus is based on a particular translation method, known as controlled syntax, which allows only
forms of expression that obey certain restricted, predetermined linguistic criteria. The
syntactic rules accepted by Titus are entirely natural and among the most classical in each
language. Sentences may contain only terms that appear in a previously established dictionary.
Each sentence is tested for syntactic and lexical validity. Any error or ambiguity is reported
by a message displayed on the screen of the terminal in use. In cases of polysemy, the operator
chooses the appropriate meaning. Titus was specially designed for the multilingual processing of
scientific and technical databases.
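
The lexical side of such a controlled-language check can be sketched as follows: every word of an input sentence must appear in a pre-established vocabulary, and words with more than one recorded sense are reported so that the operator can choose. The vocabulary and its senses are invented for the example; the syntactic validation that Titus also performs is not covered, and nothing here reproduces the actual Titus implementation.

```python
import re

# Hypothetical controlled vocabulary: term -> list of possible senses.
VOCABULARY = {
    "le": ["the"], "fil": ["yarn", "wire"], "est": ["is"],
    "teint": ["dyed"], "coton": ["cotton"], "de": ["of"],
}


def check_sentence(sentence):
    """Return (unknown words, polysemous words) for one input sentence."""
    words = re.findall(r"\w+", sentence.lower())
    unknown = [w for w in words if w not in VOCABULARY]
    polysemous = [w for w in words if len(VOCABULARY.get(w, [])) > 1]
    return unknown, polysemous


for sentence in ["Le fil de coton est teint.", "Le fil est solide."]:
    unknown, polysemous = check_sentence(sentence)
    if unknown:
        print(f"{sentence!r}: rejected, unknown terms {unknown}")
    for word in polysemous:
        print(f"{sentence!r}: choose a sense for {word!r} among {VOCABULARY[word]}")
```

Rejecting a sentence at input time, rather than after translation, is what makes the constraint workable only in a fully controlled environment, as the main text notes.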
ORIGINAL


Contribution  for  DG  XIII  brochure 
EUROTRA 


EUROTRA  is  a  Community  research  and 
development programme for the
creation  of  a  machine  translation 
system  of  advanced  design  capable  of 
dealing  with  all  the  official 
languages  of  the  EC.  It  was  adopted 
by  Council  Decision  82/752/EEC  of  4 
November  1982  and  extended  by  Council 
Decision  86/591/EEC  of  26  November 
1986  to  include  Spanish  and 
Portuguese  following  the  accession  of 
Spain  and  Portugal. 

The programme is jointly financed by
the  Community  and  its  member  States. 
Its  objective  is  the  creation  of  a 
prototype  system  which  would  be 
operational  for  a  limited  subject 
field  and  for  a  limited  number  of 
text  types  with  a  vocabulary  of 
approximately  20.000  entries.  This 
will  provide  the  basis  for 
development  on  an  industrial  scale  in 
the period following the current
programme.  In  addition  EUROTRA  aims 
at  creating  in  Europe  a  "critical 
mass"  of  expertise  in  machine 
translation  and  computational 
linguistics  in  general. 

EUROTRA  is  a  seven-year  programme 
divided  into  three  phases,  each  with 
its  own  tasks  and  objectives: 

A.  The  preparatory  phase  (two  years) 
during  which: 

(1)  the  organizational  arrangements 
for  the  project  were  agreed, 

(2)  the  linguistic  and  software 
specifications  were  defined. 
Contribution pour la brochure de DG XIII
EUROTRA


Eurotra est un programme communautaire
de recherches et de développement pour
la création d'un système de traduction
automatique de conception avancée
capable de traiter de toutes les
langues officielles de la CE. Il a été
adopté par la décision 82/752/EEC du
Conseil du 4 novembre 1982 et élargi
par la décision 86/591/EEC du Conseil
du 26 novembre 1986 pour comprendre
espagnol et portugais après l'adhésion
de l'Espagne et du Portugal.

Le programme est conjointement financé
par la Communauté et ses Etats membres.
Son objectif est la création d'un
système prototype qui serait
opérationnel pour un domaine limité et
pour un nombre limité de types de texte
avec un vocabulaire d'approximativement
20,000 entrées. Ceci fournira la base
pour le développement sur une échelle
industrielle pendant la période après
le programme actuel. En plus EUROTRA
vise à créer en Europe une "masse
critique" des connaissances dans la
traduction automatique et de
linguistique computationnelle en
général.

Eurotra est un programme de sept ans
divisé en trois phases, chacun avec ses
propres tâches et objectifs:

A. La phase préparatoire (deux ans)
pendant laquelle:

(1) les dispositions organisationnelles
pour le projet ont été convenues,

(2) linguistiques et les spécifications
de logiciel ont été définies.
"Humans have not only the ability to parse sentences but also the ability to
recognise strings as ill-formed."

Flickinger, Nerbonne, Sag & Wasow


Toward a typology of source texts for computer-assisted translation
(Essai de typologie des textes source, dans le cadre de la traduction assistée
par ordinateur)


by

Jean Gordon

Director, Central Services
Translation Operations Branch
Official Languages and Translation
Department of the Secretary of State of Canada
Ottawa  (Ottawa) 
SOMMAIRE

Le recours à la traduction assistée par ordinateur est une décision réfléchie.
Tous les facteurs doivent être pris en considération. Une typologie des textes
permet de déterminer la catégorie à laquelle ils appartiennent et leur degré de
"taoisabilité". Elle se fonde sur les caractéristiques matérielles (le support
utilisé et sa compatibilité), les caractéristiques terminologiques (la variété,
la complexité et la stabilité du vocabulaire) et les caractéristiques
stylistiques (la présence de certains traits linguistiques qui affectent la
qualité des résultats). Cet examen du profil des documents sert d'outil de
sélection et de tri et il doit être assorti d'une analyse approfondie avant la
décision finale.


SUMMARY 

The question of whether or not to use computer-assisted translation requires
serious thought. All factors must be taken into consideration. A typology helps
to identify the category into which texts fall and the extent to which they lend
themselves to this approach. The typology is based on material considerations
(medium used and compatibility), terminological considerations (variety,
complexity and stability of vocabulary) and stylistic considerations (presence
of certain linguistic characteristics that affect the quality of results). This
examination of the profile of documents is a sort of pre-screening tool. A
thorough analysis should also be conducted before any final decision is taken.


Introduction 

Any text can be translated. For a translation service, which does not turn
documents away on grounds of rhetoric, bombast or spelling, every text must be
translated.

But while the human translator can cope, albeit at the cost of lower
productivity, with documents of the most varied characteristics, the machine
cannot.

It is accepted that the professional translator must meet the requirements of
organizations or clients. Tools have been developed for selecting the translator
who will best meet those needs.

The situation is quite different for computers and computer-assisted translation
software. First, no serious vendor claims that its system can translate
indiscriminately whatever comes along. Second, most organizations have no tools
or standard tests to administer to the systems now on the market, or to those
the future promises. And even before choosing a system, one must first determine
whether the documents to be processed lend themselves to this kind of
translation. Introducing a system imposes constraints on the organization. Even
if it does not always overturn the organization and its working methods, it at
least requires adjustments. It also entails a more or less substantial
investment.

Assessing whether the texts in question are suitable for CAT ("taoisable", from
TAO, traduction assistée par ordinateur) therefore takes on its full importance
at the time of the initial decision, when the question to be answered is: is CAT
suited to the texts, or rather, do the texts lend themselves to CAT?

The sole purpose of this attempt at a document typology for CAT is to facilitate
sorting, a first selection. If it is useful, it will be in the same way as the
pre-screening tools applied to candidates for a translator's position: those who
match the established profile are admitted to the examination. Likewise, this
summary typology is intended to make it easier to establish the profile of texts
with a view to their possible processing by CAT, before a more thorough
evaluation accompanied by trials.


The factors described below were selected because of their influence on the
results of machine translation or computer-assisted translation from an English
source text into a French target text. These results were observed during trials
lasting several months with three commercial systems, namely ALPS and Systran
(EEC), at the NATO General Secretariat in Brussels, and LOGOS at the Department
of the Secretary of State of Canada. They are confirmed by analyses carried out
elsewhere or documented in the works listed in the references.

The volume to be translated was not taken into account. However, this criterion
must weigh very heavily when analysing the cost-effectiveness of CAT, or the
prospects of making it cost-effective.

Three types of factors affect computer-assisted translation and its quality:
the material or technical aspect, terminology, and stylistic characteristics.

Material or technical factors

A few years ago, specialists held that all documents to be processed by CAT had
to be available on magnetic media.

This is an essential condition, since the computer can read only machine-readable
form. It is not a sufficient condition, since the computer cannot read
everything if conversion is poor or impossible. With today's multiplicity of
workstations and systems, one needs to be more specific. It is no longer enough
for the source text to have been typed on a computer or with some word processor
or other; the file must also be convertible, and so must the formatting and
other codes.

Since then, the advent of high-performance optical readers with learning
capability has made possible a transition between systems, or the CAT processing
of texts for which diskettes are not, or are no longer, available.

This technical factor, often underestimated, has a considerable impact on
operations.

Particular mention must be made of the problems that codes for tables, graphics
and even column layout can cause during conversion. If no proven technical
solution is available, this characteristic must be taken into account in the
linguistic analysis of the documents.

Imperfect conversion or constant formatting problems reduce productivity and
lengthen translation and delivery times. Moreover, even if compatibility exists
in theory, precise typing and formatting rules must be followed. These rules
differ very little from ordinary word-processing rules, but experience has shown
that support staff often forget them when only the printed output matters. To
remedy this, information sessions were developed at the Department of the
Secretary of State of Canada for the people called on to type the source texts
for CAT projects.

If the material conditions are not good, it is better to give up introducing
CAT, or at least to postpone its implementation. Except in exceptional
circumstances, this factor is therefore eliminatory.

Terminological factors

In the context of CAT, terminological factors have a considerable impact.

Systems are generally delivered with a basic dictionary. But just as no
self-respecting translator can rely on the contents of a single general
dictionary, neither can he use CAT with the standard dictionary alone. The
system dictionary must therefore be enriched with the terminology specific to
the field and to the institution.

While this dictionary entry operation is sometimes very simple and very quick,
it can also require complex coding or multiple entries. A term may take a few
minutes, or several hours if its coding has to be validated.

Initially, the words concerned will no doubt be those not in the dictionary at
all, flagged by a search for new words in a more or less extensive corpus. This
search is carried out before translation. Except for verbs, whose coding is very
complex and must generally be entrusted to specialists, entering new words is
easy. Moreover, even in a field not previously processed by the machine, their
proportion is relatively small. For example, for a page of 595 words chosen at
random from [...] (Marius Barbeau, Secrétariat d'État, Ottawa 1964), Logos
listed twenty-three words not found in the dictionary, seven of which are [...]
(001E):

    Si'aks            001
    stench            001
    tsawltsap         001
    volcano           002
    Weohawn           001
    Wigyidcmrhsaek    001
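
The search for new words carried out before translation can be sketched as
follows. This is a minimal Python illustration under stated assumptions, not the
Logos procedure: the tokenizer, the function names `find_new_words` and
`format_report`, the sample sentence and the toy dictionary are all hypothetical.
It lists each word absent from the dictionary with its number of occurrences, in
the manner of the list above.

```python
import re
from collections import Counter


def find_new_words(text, dictionary):
    """Return {word: count} for words of `text` missing from `dictionary`.

    `dictionary` is a set of lower-cased known words (a hypothetical stand-in
    for a system dictionary). Matching is case-insensitive; the first spelling
    seen in the text is kept for display.
    """
    # A word is a run of letters, optionally joined by apostrophes or hyphens.
    tokens = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text)
    counts = Counter()
    display = {}
    for token in tokens:
        key = token.lower()
        if key in dictionary:
            continue
        counts[key] += 1
        display.setdefault(key, token)
    return {display[key]: n for key, n in counts.items()}


def format_report(new_words):
    """Format the result alphabetically with three-digit counts."""
    lines = []
    for word in sorted(new_words, key=str.lower):
        lines.append(f"{word:<20}{new_words[word]:03d}")
    return "\n".join(lines)


# Assumed sample text and toy dictionary, for illustration only.
sample = "The volcano gave off a stench. The guide said the volcano was near Weohawn."
known = {"the", "gave", "off", "a", "guide", "said", "was", "near"}
report = find_new_words(sample, known)
print(format_report(report))
```

A real system dictionary would also have to account for inflected forms, which
this sketch ignores.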

In a second stage, words must be identified that are used in a sense different
from that of the standard dictionary but specific to the field or organization
concerned. This step follows a first machine pass, which makes it possible to
spot the discrepancies. In two sentences from the passage cited above, the
equivalents "traducteur" and "spot" are drawn from the computing field and
should be replaced in the dictionary for the relevant field:

    William Beynon acting as interpreter.
    Guillaume? Beynon qui fait fonction du traducteur. (002E)

    The guide pointed to a spot nearby.
    Le guide a indiqué un spot tout près. (003E)

Third, one must confront the phenomenon of polysemy within a single field, which
increases with the complexity of the field. There is a limit to the distinctions
that can be entered in the system, and the saturation point is quickly reached.
Of all terminological problems, polysemy is therefore one of the most complex,
and the various software packages have not yet found an ultimate weapon.

    Cancel the check.
    Annuler le chèque.
    Annuler la vérification. (004E)

The terminological investment, equivalent to the work of a translator who in
another context records his research on index cards, can demand a great deal of
time and effort, of which the very simple examples given above give only a
partial idea. This work will be longer or shorter depending on the range of the
vocabulary used, itself a function of the type of texts and the breadth of the
field. Thus the terminology of a corpus made up of minutes of meetings on the
transport of explosives will be far more limited than that of a mixed corpus of
minutes, articles and reports on the same subject, which in turn is exceeded by
the scope of a similar corpus on mechanics.

If one relies solely on the delimitation of the field to determine the likely
variety of terminology in the corpus being considered for CAT, the uniformity of
its vocabulary should also be assessed. Within the same field, the involvement
of different authors can make the terminology unstable or introduce variants
that do not affect a human reader's understanding of the text but do affect
machine translation. Stable or consistent terminology, on the other hand, pays
back more quickly the time spent on each dictionary entry, and the quality of
subsequent translation is improved.

An often-cited success of machine translation, the METEO system in Canada, is
attributable in particular to the limited terminology of the sublanguage of
weather bulletins.

Imposing a controlled language is one way of obtaining the desired stability,
and some organizations have resorted to this method. In some organizational
contexts, however, it is impossible to apply. With the spread of workstations
among managers, the use of a common dictionary would be a compromise worth
considering.

Several methods can be used to assess the terminological factor. In general, it
can be presumed that the more circumscribed a field is, the more limited its
terminology and the more clearly defined its sublanguage. One can proceed
empirically and analyse the available texts, or rely on one's thorough knowledge
of the demand.

To complete an evaluation, tools are available that measure the terminological
variety of a corpus. They are akin to the keyword-in-context (KWIC) lists used
by legal professionals. Some CAT vendors have integrated this software and offer
frequency lists, with or without context.
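
A frequency list and a keyword-in-context listing of the kind mentioned here can
be sketched in a few lines of Python. This is an illustration only, not any
vendor's tool: the function names, the sample sentence and the context width are
assumptions, and the type-token ratio is a generic indicator of vocabulary
variety chosen for the example, not a measure named by the author.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-cased word tokens (letters only)."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]


def frequency_list(tokens):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def type_token_ratio(tokens):
    """Distinct words divided by total words: a crude indicator of variety."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def kwic(tokens, keyword, width=3):
    """Return keyword-in-context lines with `width` words on each side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Assumed sample sentence, built around the polysemous word "check".
corpus = "Cancel the check. Sign the check before you check the balance."
words = tokenize(corpus)
print(frequency_list(words)[:3])
print(round(type_token_ratio(words), 2))
for line in kwic(words, "check"):
    print(line)
```

Listing a word such as "check" in context makes it easier to see how many
distinct senses would need a dictionary entry.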
Stylistic factors

The last analysis, that of style, concerns a characteristic that is much harder
to pin down and define.

The evaluation typologies for translation systems, and the many trials and
examples available, give a fairly complete, though probably not exhaustive,
picture of the pitfalls and problems that source texts still pose.

It is commonly accepted that certain categories of documents, highly personal in
style but also with a vocabulary that is hard to limit, do not lend themselves
to CAT. The characteristics selected and described below are therefore not at
all relevant to literary works or legal texts where, although errors are in
principle rare, the style is often elaborate, if not dense.

As regards technical, scientific or administrative texts, and more particularly
informative texts, their characteristics can be identified in general terms, and
one can speak of texts that are well structured or well written, idiomatic or
rigorous.

The linguistic analysis described here, very modest in scope, brings a
linguistic dimension that is essential to a typology for evaluating corpora for
CAT purposes. Designed pragmatically, it can be applied to a variable volume of
available texts; its reliability, however, depends directly not so much on the
volume of texts analysed as on their representativeness.

For the purposes of this attempt at a typology, only a few linguistic
characteristics have been selected. Others could be added and the model refined.
Those listed here have an immediate impact on quality and have the advantage of
being easy to recognize. They can be used by someone unfamiliar with CAT, and a
translator accustomed to a translation evaluation system has no difficulty
applying them.

These characteristics all have one thing in common: their presence has a
negative effect, either because it causes problems at the dictionary level by
creating homographs, or because it distorts or complicates the analysis. Some
are regarded by grammarians and linguists as errors. Others are perfectly
acceptable, even sought-after, stylistic features.

Some may be surprised to see errors of spelling, grammar or usage figure
prominently. Their inclusion acknowledges the reality of the imperfect world of
writers who are in a hurry or are not writing in their own language. The
examples given are taken from real texts.

Spelling errors, with which typing errors and misprints can be grouped, create
unknown, untranslatable words:

    ... metal, cables, elements ect.
    du métal, des câbles, des éléments ?ect. (303)

    He has responsability ...
    Il a le? responsability ... (L810)

Worse still, by replacing one word with another, they derail the analysis by
disguising a verb as a noun, or an article as a verb:

    One of the functions is each category is designated ...
    Une des fonctions est est désignée chaque catégorie ... (711)

The most serious agreement errors, for CAT purposes, are those between subject
and verb; more frequent than one might imagine, they generally distort the
analysis if they create ambiguity as to the subject of the verb:

    This group of students wish to visit the museum.
    Ce groupe du souhait d'élèves pour visiter le musée. (005E)

    If either A or B wish to go out, we will do this
    Si souhait A ou B pour sortir, nous ferons ceci. (006E)

Errors that revisers or language evaluators would have penalized are not, it
should be remembered, among the faults to be noted if they have no effect or, on
the contrary, have a beneficial effect by clarifying grammatical links. This is
the case with certain gallicisms and with noun complements.

Ellipses and omissions fall into an entirely different category. Whether faulty
or justified, they have an undeniably harmful effect.

Omitting the article before a noun can create a stylistic homograph and, lacking
the necessary information, the software will translate a noun as a verb, or vice
versa:

    Paint surface between lines.
    Surface de peinture entre les lignes. (007E)

    Fill vase with water.
    Vase de plein avec l'eau. (008E)
Omitting the subject pronoun before a verb can likewise create a stylistic
homograph, with the same kind of consequence as above:

    To produce reports.
    Le produit rapporte. (267P)

Other types of omission are less widespread. They can be noted all the same, but
their effect seems more random.

    if, however, a function other than the prime desired.
    Si, toutefois, une fonction autre que l'apogée voulue. (73)

    Close the back and the front door.
    Fermer le dos et la porte de devant. (009E)

    How to turn on and off the motor.
    Comment tourner le moteur de marche/arrêt. (010E)


The use of prepositions should be noted. Most software packages have
incorporated the most common rules or idiomatic constructions. However, the
variety of phrasal forms consisting of a verb and a preposition is infinite, and
every author can add to them at will. The necessary rules then contradict one
another, or create contradictory lexical units, with results to match:

    The system manager can go on to delete several entries
    Le gestionnaire du système peut être supprimé continuer plusieurs entrées (267P)

A particularly deadly characteristic is the stacking of complements or
attributes, nouns or adjectives, whose presence seems to prompt systems into
distributions that one might think random:

    The site level troubleshooting tasks are essentially performed...
    L'installation nivelle est essentiellement exécuté les tâches de dépannage.... (258)

    (The troubleshooting tasks at site level are essentially performed / Les tâches
    de dépannage, au niveau d'installation, sont essentiellement exécutées)

    "Appendix A" "Maintenance Technical Parameters Check Sheet" should be used.
    La "feuille de vérification de paramètres de maintenance d'"A" d'annexe
    "technique" devrait être utilisée. (316)

The presence of cascading relative clauses has also been included, even though
in some well-structured texts it does not seem to harm the result. That,
however, is not the rule but the exception.

The use of abbreviations and acronyms, and more particularly of homographic
acronyms, forms a separate category. The latter, being easier to pronounce, are
very popular. The question of acronyms could be considered among the
terminological factors. It appears among the stylistic factors because of the
particular difficulties that coding these terms or entering them in the
dictionary can present, but also because it gives rise to errors or
irregularities of usage or grammar.

    Shape delegates arrived.
    Les déléguées de forme sont arrivées. (012E)

    The SCC panels offer status information.
    Le SCC lambrisse l'information d'état d'offre. (17)


Finally, the presence of tables or graphics, or presentation in columns, if it
was not taken into account under material considerations and settled at that
stage by appropriate measures, must be taken into account at the time of the
stylistic analysis.

The features listed above may be frequent, occasional or rare, and the corpus
will be rated accordingly. The rating may be combined, that is, based on the
occurrence of the various characteristics taken together. The margin of
tolerance will then be wider. As a guide, it is suggested that the "rare"
category be reserved for fewer than three occurrences per page, the "occasional"
category for three to six occurrences per page, and the "frequent" category for
more than six occurrences per page. The rating may also be assigned to each
linguistic characteristic separately, and the margin of tolerance will be
narrowed accordingly.
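
The indicative scale can be written down as a short Python sketch. The function
names and the feature labels are hypothetical. The text does not say by how much
the margin of tolerance should be widened or narrowed, so the two thresholds are
left as parameters (`low`, `high`) that default to the suggested figures of
three and six occurrences per page.

```python
def rate_occurrences(occurrences, pages, low=3, high=6):
    """Rate a count of negative features as rare, occasional or frequent.

    Defaults follow the indicative figures in the text: fewer than `low`
    occurrences per page is rare, `low` to `high` is occasional, and more
    than `high` is frequent.
    """
    if pages <= 0:
        raise ValueError("pages must be positive")
    per_page = occurrences / pages
    if per_page < low:
        return "rare"
    if per_page <= high:
        return "occasional"
    return "frequent"


def rate_corpus(feature_counts, pages, combined=True, low=3, high=6):
    """Rate a corpus on the combined count or feature by feature.

    `feature_counts` maps a feature label (hypothetical names) to the number
    of occurrences found in the sample of `pages` pages.
    """
    if combined:
        return rate_occurrences(sum(feature_counts.values()), pages, low, high)
    return {
        name: rate_occurrences(count, pages, low, high)
        for name, count in feature_counts.items()
    }


# Assumed counts for a five-page sample, for illustration only.
sample_counts = {"spelling": 4, "omitted_article": 10, "noun_stack": 30}
print(rate_corpus(sample_counts, pages=5))
print(rate_corpus(sample_counts, pages=5, combined=False))
```

Passing different `low` and `high` values is one way to apply a wider margin to
a combined rating and a narrower one to per-feature ratings.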

However, it must not be forgotten that whatever translation system is used, it
is refined by entering rules and terminology, and that this is an effort renewed
every day.

The linguistic analysis will therefore be supplemented by an evaluation of the
stability of style and the recurrence of typical forms, based on the documents
themselves or inferred from the organizational structure and practices of the
institution. Some organizations, for instance, use set formulas for all their
documents, or for their minutes. Certain types of documents are generally
written by the same author, or by the same group of authors, achieving a
consistency of style that could not be expected from writers scattered across
the country, or even across several countries.

In a situation of this kind, recurrent expressions and particularly frequent
turns of phrase can be integrated in one form or another, and this can give
interesting results.

It is also possible to consider influencing style by working closely with the
authors, without imposing a drafting straitjacket, and thereby changing the
profile of the texts. Once the decision to embark on CAT has been taken, an
information session draws the writers' attention to certain types of problems
that have been identified in the base corpus and will be avoided by following
certain drafting rules.

Evaluation grid

Applying these three factors yields an evaluation grid that classifies documents
into seven categories, in decreasing order of suitability for CAT
("taoisabilité").

Depending on the objectives pursued, once the profile has been recognized, an
organizational decision could in principle move a corpus up to the next
category, for example by imposing a uniform vocabulary and a controlled style on
managers. Such a revolution would, however, have to be feasible and its effects
assured.
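
The ordering of the grid, and the idea of moving a corpus up one category, can
be sketched as an ordered enumeration in Python. The English member names are
renderings introduced for this example; the author's French headings are given
in the comments. The `promote` helper is hypothetical and only models the move
to the next category; it says nothing about whether such a move is achievable.

```python
from enum import IntEnum


class Suitability(IntEnum):
    """The seven grid categories, from most (1) to least (7) suitable."""
    HIGHLY_SUITABLE = 1      # Textes très taoisables
    GENERALLY_SUITABLE = 2   # Textes généralement taoisables
    SUITABLE = 3             # Textes taoisables
    POSSIBLY_SUITABLE = 4    # Textes peut-être taoisables
    DIFFICULT = 5            # Textes difficilement taoisables
    VERY_DIFFICULT = 6       # Textes très difficilement taoisables
    NOT_SUITABLE = 7         # Textes non taoisables


def promote(category):
    """Return the next more suitable category; the top category is unchanged."""
    return Suitability(max(category - 1, Suitability.HIGHLY_SUITABLE))


# Sorting puts the most suitable corpora first.
ranked = sorted([Suitability.DIFFICULT, Suitability.SUITABLE, Suitability.NOT_SUITABLE])
print([c.name for c in ranked])
print(promote(Suitability.POSSIBLY_SUITABLE).name)
```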

Texts highly suitable for CAT

These are texts eminently suited to CAT, ideal in every respect and with regard
to the various factors.

In this case, they would be texts entered directly on the CAT computer, with all
the required formatting codes. At the present stage, this could imply that the
document consists of continuous text and contains no columns, not a single table
and of course no graphics.

Their terminology would be limited and stable, with no polysemy, as would be the
case for a well-defined subfield or sublanguage.

Stylistically, the texts would show none of the negative characteristics:
unambiguous text, written rigorously according to the rules of grammar,
respecting usage, without spelling errors or misprints. Sentences are short, but
without shortcuts or ellipses. While stacking and cascades of words are banned,
idiomatic constructions in which prepositions abound are also excluded. Fancy
and imagination do not disturb this idyllic portrait.

For a corpus made up solely of texts of this kind, meeting the terminology
criterion and hence that of a well-delimited sublanguage, the difficulty might
rather be of another kind, namely whether a sufficient volume exists.

Texts generally suitable for CAT

These are texts that broadly meet all the criteria. Certain deviations
complicate processing or slow the process, but they are well delimited and can
be corrected by a specific intervention.

For example, they might be documents entered on the same computer, with very
limited terminology, but whose negative linguistic characteristics are extremely
stable and can be built into a CAT system. This is the case of weather
bulletins, whose very particular profile gave rise to the development of a
dedicated system, which the considerable annual volume justifies.

Also in this category would be documents from a very limited field, with a
positive linguistic profile (rare occurrences), but available only on magnetic
media. The solution, namely establishing or improving conversion, would then be
technical.

Texts suitable for CAT

These are texts that meet most of the criteria listed. The deviations that
affect the results do not lend themselves to a single precise solution; instead,
corrective efforts must continue over a certain period or may bear on several
aspects.

For example, they would be texts from a limited field, with a stylistic profile
rated "rare", whose negative characteristics do not occur with significant
frequency. The solution would be terminological and linguistic.

Also suitable would be texts on magnetic media but cluttered with columns, or
texts on paper whose characters can be read by an optical reader. The solution
would then be technical and organizational.
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Textes  peut-etre  taoisables 

Ce  sont  les  textes  qui  rdpondent  generalement  a  la  plupart  des  criteres 
enumeres.  Des  ecarts  affectent  les  resultuts  raais  il  pourraient  etre  regies  par  du  travail 
portant  sur  1’ aspect  linguistique . 

XI  pourrait  s’agir  de  textes  dans  un  domaine  a  la  terminologie  limitee,  aux 
occurences  stylistiquos  occasionnelles  mais  recurrontes  et  uniformes.  Par  exemple, 
l’utilisation  d’acronymes  ou  d ’ abrcviations  est  frequente,  mais  coherente,  ou  encore  le 
texte  contient  des  constructions  vcrbales  particulieres,  qui  paraissent  stables. et  peuvent 
faire  l’objet  d'une  regie.  Da  solution  est  d’ordre  linguistique  mais  pcut  etre  couteusc. 

C’est  pour  cette  categorie  que  1 'analyse  linguist! tique  plus  poussde  est 
essentielle,  et  ello  doit  etre  assortie  d'une  evaluation  du  cout  des  ameliorations 
necessaires. 

Textes  di f f icilement  taoisables 

Co  sont  des  textes  qui  ne  repondent  pas  aux  criteres.  Leurs  caracteristiques 
matenelles,  terminologiques  ou  stylistiquos  affectent  le  traitement. 

Ainsi,  serait  diff Icilement  taoisable  un  ensemble  de  manuels  sur  des  sujets 
divers,  en  raison  du  vaste  domaine  ct  de  la  torminologio  variee  et  instable. 

Sur  le  plan  materiel,  des  textes  aux  nombreux  tableaux  et  graphiques 
presenteraient  le  meme  profil  de  difficult^. 

Texts very difficult to process with CAT

These are texts that do not meet one of the essential criteria. Their terminological and stylistic
characteristics considerably affect processing.

This category would include, for example, a corpus of judgments from an administrative tax court.
The domain, although precise, is vast, and the authors are numerous, with full latitude to exercise
their drafting prerogatives. The stylistic factor would be disqualifying.

Texts not suitable for CAT

These are texts that do not meet the essential criteria. Their characteristics are disqualifying:
on the terminological level, an undefined domain; on the stylistic level, negative characteristics
that are frequent and inconsistent; and on the physical level, a form that is not machine-readable.

For informative texts, this would be the case of publications on trademarks, whose terminology
covers practically every field, without any uniformity, and whose elliptical style poses
considerable problems of analysis.

This would obviously also be the case for articles available on paper only, manuscripts for
example, that cannot be read by an optical reader.

Conclusion


The attempt at a typology presented here is only a rough outline, which remains to be refined and
made more precise. The considerable obstacle presented by the multiplicity of text types and of
their characteristics has not been overcome. Other, more scholarly analyses have dealt with the
subject. In a deliberate effort at simplification, an attempt has been made to develop a tool that
serves as a sieve for the manager standing at the crossroads of technology.

This tool will have to evolve, first to adapt to the rapid evolution of the computing sector, since
one may hope that conversion problems will be resolved one day. The typology can also be enriched
on the linguistic level as it is used on different corpora. A more precise quantification of
occurrences would be particularly valuable for situating corpora. Studies under way on evaluations
will provide additional information. Other articles on natural language processing will provide
data for comparison.

This tool, then, imperfect but perfectible, can help the manager determine, as a first step,
whether computer assisted translation is a feasible solution.

For a few clear cases, it will no doubt spare the manager a more thorough analysis. In other
situations, it will not replace an analysis of the linguistic structures of the documents or,
failing that, an evaluation of controlled CAT trials, in order to target the application of CAT.

This evaluation of intrinsic factors must, of course, be supplemented by a rigorous analysis of all
the external factors, including the organizational context and imperatives, the investment and
operating costs, and above all the objectives pursued.

Machine translation or computer assisted translation offers a tempting solution to the problems
that managing the translation workload can pose. Before yielding to temptation, it is wise to
measure one's chances of success.
SUMMARY


The purpose of this paper is to give a brief overview of the problems evidenced during studies into the use of
Computer Assisted Translation (CAT) in the AEROSPATIALE AIRCRAFT DIVISION technical publications
production environment. The aim is not to review the capabilities of the various CAT systems available in a
comparative study but rather to highlight the technical, economic and psychological problems that have to
date precluded the integration of CAT in this very specific industrial context.


1.  INTRODUCTION 
A.  GENERAL 

In  1989,  for  the  first  time,  AEROSPATIALE,  with  their  European  partners,  logged  more  than  30%  of  world 
orders  for  civil  aircraft  in  their  sector  of  the  market  thus  becoming  the  second  largest  aircraft 
manufacturer  in  their  category. 

In  that  year  there  was  a  total  fleet  of  721  AIRBUS  and  ATR  aircraft  in  service  with  142  customers. 

By the year 1995, in just 5 years' time, this fleet will have grown to around 2500 aircraft in regular service
around the world with some 300 customers.

Such  rapid  expansion  implies  a  constant  search  for  increased  productivity  and  efficiency  to  achieve 
reduced  production  costs  and  cycles  while  continuously  enhancing  the  quality  of  service  provided  to  the 
customers. 

The AEROSPATIALE AIRCRAFT DIVISION, fully aware of the commercial stakes involved, is organizing to
meet this challenge and has adopted a strategy largely based on:

- an ambitious training program aimed at adapting the personnel to the new requirements of the rapidly
evolving industrial context,

- the development of advanced production means taking maximum advantage of the possibilities offered
by computerization.

It  is  in  this  highly  dynamic  context  characterized  by  a  constant  search  for  new  means  of  pushing  back  the 
limits  of  productivity  and  efficiency  that  the  Technical  Publications  Department  has  conducted  numerous 
studies  into  the  possible  utilization  of  Machine  Translation  (MT)  or  Computer  Assisted  Translation  (CAT). 


B.  HISTORY  OF  TRANSLATION  IN  THE  AEROSPATIALE  AIRCRAFT  DIVISION 

Translation  in  the  Aircraft  Division  has  undergone  and  is  still  undergoing  considerable  change  to  adapt  to 
the  requirements  of  a  continuously  evolving  environment. 

It was with the success of CARAVELLE in the fifties that the need for translation led to the setting up of
small groups of translators within various departments: Design Office, Production, Quality Assurance,
Flight Test and, of course, Product Support in direct contact with the customer.

These small groups grew in size with the first experience of European cooperation: CONCORDE.

At  that  time  only  a  very  small  percentage  of  AEROSPATIALE  personnel  was  capable  of  getting  along  in 
English.  It  was  therefore  necessary  to  translate  all  the  documents  transiting  between  BRITISH  AEROSPACE 
and  AEROSPATIALE.  The  presence  of  an  interpreter  was  indispensable  at  all  the  working  meetings 
between  the  two  partners. 

The  CONCORDE  technical  publications  were  initially  issued  in  the  two  official  languages  of  the  program, 
English  and  French,  before  the  French  version  was  abandoned  for  cost  reasons. 

When  the  AIRBUS  program  was  launched,  AEROSPATIALE,  rich  with  the  experience  gained  with 
CONCORDE,  strove  to  eliminate  the  problems  arising  from  the  use  of  two  (or  more)  languages. 

English  was  adopted  as  the  official  language  of  the  AIRBUS  program.  All  correspondence  between  the 
partners  had  to  be  in  the  English  language.  AEROSPATIALE  therefore  launched  a  vast  training  program  so 
that the personnel concerned acquired a level of English sufficient:

-  to  understand  routine  correspondence  received  in  English, 

-  to  write  directly  in  English  simple  memos  addressed  to  the  partners, 

-  to  get  along  in  inter-partner  meetings  conducted  in  English  without  the  assistance  of  an  interpreter. 

Thus,  in  many  sectors,  the  use  of  English  as  the  official  language  for  the  AIRBUS  and,  later  on,  the  ATR 
programs  resulted  in  a  reduction  in  workload  for  the  translators  and  the  gradual  dissolution  of  the 
translation  offices. 

The Product Support translation office, however, largely due to the volume of translation involved in the
production of technical publications, continued to grow and has now become the largest single group of
translators in the Aircraft Division.


The  basic  issue  of  the  contractual  technical  publications  for  one  aircraft  represents  some  39  manuals  and 
approximately  800,000  printed  pages. 

The  major  manuals  are  customized  either  to  the  airline  fleet  or  to  the  aircraft  and  a  revision  service  keeps 
them  up  to  date. 

With  the  multiplication  of  aircraft  types  produced,  the  sharp  rise  in  aircraft  sales  and  the  rapid  expansion 
of  the  in-service  fleet,  the  quantity  of  technical  publications  shipped  each  year  is  constantly  increasing. 
Over  the  last  ten  years  the  volume  of  technical  publications  shipped  yearly  has  increased  from  18  million 
pages  in  1979  to  69  million  pages  in  1989. 

The technical publications, whether AIRBUS or ATR, are produced in cooperation by the various partners
on the basis of an industrial worksharing defined by the GIE. In both cases overall leadership for the
technical publications has been awarded to AEROSPATIALE. As lead partner, AEROSPATIALE is
responsible for developing all the EDP (Electronic Data Processing) facilities required for the production of
technical publications.

Considerable  investments  have  been  devoted  to  the  research  and  development  of  high-performance 
software  capable  of  coping  with  the  ever-increasing  volume  of  data  to  be  processed  and  quantities  of 
publications  to  be  produced. 

For  the  A320,  the  EDP  systems  used  for  the  management,  acquisition  and  finalization  of  the  technical  data 
have  been  totally  redesigned  to  comply  with  the  new  requirements  of  ATA  specification  100,  which 
establishes  rules  for  the  presentation  of  the  data,  and  to  achieve  greater  flexibility  in  the  production 
cycles. 

The  A320  is  in  fact  the  first  aircraft  in  the  world  for  which  the  Aircraft  Maintenance  Manual  integrates  the 
requirements  of  the  ATA  100  AMTOSS  concept  (Aircraft  Maintenance  Task  Oriented  Support  System) 
designed  to  improve  the  organization  of  the  Maintenance  Manual  and  to  facilitate  automated  data 
retrieval.  An  open-ended  system  designated  GIPSY  (General  Integrated  Publications  System)  has  been 
specifically  developed  to  meet  these  requirements.  With  GIPSY  the  technical  authors,  assisted  by 
numerous  built-in  aids,  update  the  data  files  in  real-time  and  it  is  possible  to  obtain  customized  outputs  of 
the manual to the latest technical status as and when required. In developing these new systems, the
Technical Publications Department has acquired high potential for innovation and participates actively in
a wide range of projects aimed at improving existing Product Support services or creating new ones:

- Technical publications on optical disk (ADRES)

- Computer-Assisted Aircraft Trouble Shooting (CAATS)

- Order Processing Automated on-Line (OPAL)

- Technical publications stock and shipping management software system (APASHE)

-  Onboard  Electronic  Library  System  (ELS) 

-  Maintenance  Information  Planning  System  (MIPS) 

-  On-line  interrogation  by  the  airlines  of  the  manufacturer  data  banks 

-  etc.... 

This constant search for innovative methods of increasing productivity and enhancing the quality of
service provided to the customers while reducing costs and production cycles has not neglected machine
translation. At present, the Technical Publications Department has a group of 11 full-time
translators backed up by an equivalent number working as subcontractors. This group is responsible for all
technical publications translation-related activities as well as the translation of various documents such as
correspondence, technical reports, specifications, presentations, brochures, press articles, contracts, etc.
issued or received by Product Support and a wide range of other departments within the Aircraft Division.

The possibility of using MT or CAT has therefore aroused widespread interest and numerous in-depth
studies have been conducted to investigate the feasibility of integrating MT or CAT in this very specific
environment. These studies have evidenced a certain number of problems that have to date rendered this
integration impossible for technical and/or economic reasons.
2. PRACTICAL PROBLEMS ASSOCIATED WITH THE UTILIZATION OF MACHINE TRANSLATION

The utilization of MT/CAT inevitably raises a certain number of problems irrespective of whether the system
adopted is a small system operating in a PC environment or a large system operating on a central computer or
accessible on a subscription basis via an external network. These problems are mainly technical and economic
but certain human and psychological problems, although of lesser importance in the decision process, should
not be totally neglected. Two criteria generally determine the cost effectiveness of MT/CAT:

-  the  volume  to  be  translated, 

-  the  extent  of  human  preparation/correction  required  to  obtain  a  satisfactory  result. 

It  is  evident  that  the  utilization  of  MT/CAT  can  only  be  envisaged  if  the  volume  of  translation  involved  justifies 
the  investments.  Furthermore,  any  time  spent  by  human  translators  in  preparing  source  documents  for 
MT/CAT  (pre-editing)  or  correcting  MT/CAT  outputs  to  achieve  the  required  result  (post-editing)  reduces  the 
cost-saving  capacity  of  the  system. 
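
The interplay of these two criteria can be sketched as a simple break-even calculation. The sketch below is an illustration only, not a model used in the studies reported here: the function names, the cost breakdown and every numeric value are hypothetical assumptions.

```python
def mt_net_saving(pages, human_cost_per_page, pre_edit_cost_per_page,
                  post_edit_cost_per_page, machine_cost_per_page,
                  fixed_investment):
    """Net saving of MT/CAT over fully human translation (hypothetical model).

    A positive result means MT/CAT is cheaper for this volume.
    All costs are assumed to be expressed in the same currency unit.
    """
    human_total = pages * human_cost_per_page
    mt_per_page = (pre_edit_cost_per_page + machine_cost_per_page
                   + post_edit_cost_per_page)
    mt_total = fixed_investment + pages * mt_per_page
    return human_total - mt_total


def break_even_pages(human_cost_per_page, pre_edit_cost_per_page,
                     post_edit_cost_per_page, machine_cost_per_page,
                     fixed_investment):
    """Volume above which MT/CAT pays off, or None if it never does."""
    saving_per_page = human_cost_per_page - (pre_edit_cost_per_page
                                             + machine_cost_per_page
                                             + post_edit_cost_per_page)
    if saving_per_page <= 0:
        return None  # pre- and post-editing absorb the whole saving
    return fixed_investment / saving_per_page
```

With invented figures (human translation at 10 units per page, pre-editing 1, machine processing 1, post-editing 4, investment 2000), the break-even volume is 500 pages. If pre- and post-editing together cost as much as human translation, no volume justifies the investment.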

In  addition  to  these  "universal"  criteria,  there  are  of  course  other  more  specific  criteria  that  need  to  be  taken 
into  consideration  such  as  the  technical  and  economic  aspects  of  integrating  MT/CAT  in  a  given  EDP 
environment. 

The decision as to whether or not to go MT/CAT is, therefore, based essentially on a comparative study
between the constraints and problems inherent in the integration and utilization of the system and the
estimated savings in terms of translation costs, leadtimes and personnel.

To date, the utilization of MT/CAT, in the very specific translation environment existing in the AEROSPATIALE
AIRCRAFT DIVISION Product Support organization, generates a rather exceptional accumulation of
constraints and problems that preclude a rational and cost-effective integration of the systems currently
available on the market. Indeed, only a very small percentage of the aircraft technical publications, or of the
far from negligible volume of various other documents transiting through the translation office, is suitable for
MT/CAT.

A.  TECHNICAL  AND  ECONOMIC  PROBLEMS  WITH  MT/CAT 
(1)  Aircraft  Technical  Publications 
(a)  Volumes  of  data 

The  studies  carried  out  to  date  have  shown  that,  despite  the  impressive  quantity  of  pages  they 
represent,  the  aircraft  technical  publications  contain  relatively  little  text  compatible  with  MT/CAT. 

As  has  already  been  mentioned,  the  technical  publications  package  for  one  aircraft  comprises 
some  39  manuals. 

Each  of  these  manuals  has  been  developed  for  a  specific  utilization  in  a  specific  context  and 
complies  with  very  strict  rules  imposed  by  industry  standards  such  as  the  ATA  specifications. 

The  content  of  certain  manuals  such  as  the  Aircraft  Wiring  Manual  and  Aircraft  Schematic  Manual 
is  essentially  graphic. 

In  other  manuals,  the  content  is  a  combination  of  contracted  text  and  complex  layout. 

In  the  Aircraft  Trouble  Shooting  Manual  the  text  is  presented  in  the  form  of  diagrams. 

In  the  operational  manuals,  the  text,  generally  written  directly  in  the  end-user  language  by 
specialized  personnel,  is  characterized  by  highly-integrated  graphics. 

The  Illustrated  Parts  Catalog,  which  is  basically  a  series  of  illustrations  with  associated 
nomenclature,  contains  little  text  suitable  for  MT/CAT. 

The  conclusion  of  a  detailed  review  of  all  the  various  manuals  is  that  the  cost-effective  utilization 
of  MT/CAT  in  the  technical  publications  context  is  largely  dependent  on  its  utilization  for  the 
Aircraft  Maintenance  Manual  (AMM). 

Indeed  the  AMM  is  one  of  the  rare  manuals  to  contain  text  in  sufficient  volume  and  in  a  form  that 
enables  the  cost-effective  utilization  of  MT/CAT  to  be  envisaged. 

The AMM is divided into two parts:

- a theoretical part describing the various systems of the aircraft and their operation

-  a  practical  part  detailing  the  various  procedures  required  for  the  maintenance  of  the  aircraft. 

It is during the preparation of the basic manual for a given type of aircraft that, due to the
considerable volume of text to be translated in a relatively short period, the utilization of MT/CAT
is the most attractive. As an example, the A320 AMM for a given customer contains around
28,000 pages (≈ 17,000 text and 11,000 illustrations) and the total data bank around 40,000 pages
(≈ 25,000 text and 15,000 illustrations).


When  the  basic  content  has  been  issued,  the  manual  enters  the  revision  phase  during  which  it  is 
updated  at  regular  intervals  (generally  quarterly)  to  integrate  changes  relative  to  modifications 
embodied  on  the  aircraft,  variants  specific  to  customized  configurations  or  the  correction  of 
possible  errors.  During  the  first  years  in  the  life  of  a  manual  approximately  20  %  of  the  pages  are 
revised  at  each  revision  although  the  percentage  of  text  actually  new  or  modified  is  very  much 
lower  than  this  figure. 
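
The orders of magnitude quoted for the A320 AMM can be checked with a few lines of arithmetic. The page counts and the 20 % figure are the approximate values given in the text; applying the 20 % to the customer manual rather than to the data bank is an assumption made here for illustration.

```python
# Approximate figures quoted in the text for the A320 AMM
customer_text, customer_illus = 17_000, 11_000   # one customer's manual
bank_text, bank_illus = 25_000, 15_000           # total data bank
revised_percent = 20                             # pages touched per revision

customer_total = customer_text + customer_illus  # 28,000 pages
bank_total = bank_text + bank_illus              # 40,000 pages

# Share of pages that carry text rather than illustrations
text_share_customer = customer_text / customer_total
text_share_bank = bank_text / bank_total

# Pages revised at each revision (assumption: 20 % of the customer manual)
pages_revised = customer_total * revised_percent // 100

print(f"text share, customer manual: {text_share_customer:.1%}")
print(f"text share, data bank:       {text_share_bank:.1%}")
print(f"pages revised per revision:  {pages_revised}")
```

Roughly three pages in five carry text, and a revision touches several thousand pages, although, as noted above, the text actually new or modified within those pages is a much smaller quantity.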

However,  although  these  figures  are  quite  impressive,  it  is  important  to  note  that,  in  today's 
context  of  European  cooperation,  the  AMM  is  produced  on  an  inter-partner  worksharing  basis 
and  that  this  has  a  significant  impact  on  the  volume  of  translation. 

In the case of the ATR, the AMM is officially issued in English and French. AEROSPATIALE is
responsible for translating its contribution from French to English and the AERITALIA
contribution from English to French.

As  far  as  AIRBUS  is  concerned,  the  AMM  is  officially  issued  in  English  only.  However,  for  the  A320, 
AEROSPATIALE  has  separate  contracts  for  the  translation  of  the  AMM  from  English  to  French. 

For the official version in English, the other AIRBUS partners write their contributions directly in
English. As a general rule, the AEROSPATIALE authors, responsible for approximately 70 % of the
manual, write in French during the initial production phase when time is short and the texts are
long. However, once the manual enters the revision phase, the same authors tend to write
modifications  to  existing  texts  or  variants  derived  from  existing  texts  directly  in  English.  The 
French  version  of  the  manual  is  produced  by  retrieval  of  the  texts  written  directly  in  French  and 
translation  of  the  partner  contributions  as  well  as  the  texts  written  directly  in  English  by  the 
AEROSPATIALE  authors. 

The  quantity  of  translation  involved  in  the  production  of  the  AMM  is,  therefore,  less  than  the 
total  volume  of  the  manual  would  initially  seem  to  indicate  but  remains  sufficient  to  warrant  an 
investigation  into  the  utilization  of  MT/CAT. 

(b)  Integration  of  MT/CAT  in  the  Technical  Publications  Production  Process 

The  volume  of  translation  involved  in  the  production  of  the  AMM  having  been  judged  sufficient 
to  justify  the  utilization  of  MT/CAT,  the  next  step  is  to  study  its  integration  in  the  production 
process. 

One  of  the  factors  influencing  the  cost  effectiveness  of  MT/CAT  is  whether  or  not  the  source  text 
is  available  in  a  form  that  can  be  fed  directly  into  the  system. 

At  first  sight,  the  ATR  Maintenance  Manual  production  environment  seems  particularly  well 
adapted  to  MT/CAT  in  this  respect. 

Texts written in English are acquired in one file and those written in French are acquired in
another. It would be relatively simple to integrate the MT/CAT between the two files to translate
the source text (whether English or French) into the other language.

Unfortunately,  however,  for  EDP  reasons  the  ATR  Maintenance  Manual  has  the  particularity  of 
being  acquired  entirely  in  upper  case.  Tests  performed  on  representative  samples  of  the  manual 
have  shown  that,  due  to  the  absence  of  lower  case  letters  and  more  especially  the  accentuation, 
the  results  obtained  with  MT/CAT  when  translating  from  French  to  English  are  totally 
unacceptable. 
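
The loss of information caused by upper-case acquisition without accents can be illustrated with a short sketch. The word pairs below are ordinary French words chosen for illustration, not samples from the ATR manual, and the helper name `to_unaccented_upper` is hypothetical.

```python
import unicodedata


def to_unaccented_upper(text):
    """Upper-case a string and drop its accents, as in an upper-case-only file."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return stripped.upper()


# Pairs of distinct French words (illustrative examples only)
pairs = [("a", "à"), ("ou", "où"), ("ferme", "fermé"), ("du", "dû")]

# Pairs that become indistinguishable once case and accents are lost
collisions = [(x, y) for x, y in pairs
              if to_unaccented_upper(x) == to_unaccented_upper(y)]

for x, y in collisions:
    print(f"{x!r} and {y!r} both become {to_unaccented_upper(x)!r}")
```

Every pair collapses to a single form. A system analysing "FERME", for instance, can no longer tell a present-tense verb ("ferme") from a past participle ("fermé") without further context, which is one plausible reason for the poor French-to-English results reported above.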

In the case of AIRBUS and more particularly the A320, A330, A340 and future programs, the
problems are different.

To  comply  with  the  latest  requirements  of  the  ATA  100  relative  to  the  AMTOSS  concept,  to  enable 
a  more  rational  utilization  of  EDP  data  management  systems  and  to  prepare  the  way  for  the  new 
Technical  Publications  media,  AEROSPATIALE  has  totally  redesigned  the  Technical  Publications 
production  systems. 

For example, the ATA 100 AMTOSS concept requires that the manufacturer provide the airlines
with a PMDB (Production Management Data Base). This bank contains data related to the
planning and organization of maintenance extracted directly from the text of the Aircraft
Maintenance Manual. The production of this data implies the integration in the text during
acquisition of codes (tags) identifying the data for subsequent extraction. With the new Technical
Publications media (such as optical disk) now being developed, these tags are also used to
establish intra-manual and inter-manual links. It is important not only to establish links between
data but to ensure consistency of data within a given manual, between manuals and with the
placards on the aircraft and equipment. It was decided that, to avoid duplicating the acquisition
of data with the risk of error this represents, wherever possible the data would be extracted
directly from the data source file and automatically integrated in the text.

Another  objective  that  largely  influenced  the  design  of  the  new  production  systems  was  that  they 
should  be  capable  of  immediately  outputting  a  customized  manual  fully  updated  with  the  latest 
known  data  without  being  subordinated  to  rigid  revision  cycles.  To  achieve  this  objective  the 
systems  were  designed  to  enable  the  authors,  whether  in  France,  Germany,  Great  Britain  or  Spain 
to  acquire  their  data  on-line  via  terminals  connected  to  the  central  data  bank  in  the 
AEROSPATIALE  TOULOUSE  facilities. 
In view of all these and other requirements, the new data acquisition system, designated GIPSY
(General Integrated Publications System), was organized around file management software
facilitating the transfer of data between files rather than conventional word processing.

In fact the system manages a certain number of different but interconnected files, each
containing elements of the manual. The Maintenance Manual as such is the result of a finalization
process which consists in extracting data from these various files, compiling the extracted data
and presenting it in a form adapted to its future utilization by the end-user. This totally new
Technical Publications production concept, based on the use of file management rather than
word processing software, involves a very specific on-line data acquisition process. This process
requires the use of 132-character screens and this, together with the presence in the text of
numerous tags, considerably complicates the integration of MT/CAT.
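
Why embedded tags complicate matters can be seen from the handling they require around any translation step. The sketch below shows one generic workaround, placeholder substitution. It is not a description of GIPSY or of any system evaluated here: the angle-bracket tag syntax, the placeholder format and the function names are all hypothetical.

```python
import re

# Hypothetical tag syntax for illustration; the GIPSY format is not given here.
TAG_PATTERN = re.compile(r"<[^<>]+>")
PLACEHOLDER_PATTERN = re.compile(r"__TAG(\d+)__")


def protect_tags(text):
    """Replace each tag with a numbered placeholder; return (text, tags)."""
    tags = []

    def _swap(match):
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG_PATTERN.sub(_swap, text), tags


def restore_tags(text, tags):
    """Put the original tags back in place of their placeholders."""
    return PLACEHOLDER_PATTERN.sub(lambda m: tags[int(m.group(1))], text)


source = "Remove the panel <ZONE>191</ZONE> and the pump <PN>D5710</PN>."
protected, saved = protect_tags(source)
# ... the protected text would be passed to the translation step here ...
assert restore_tags(protected, saved) == source
```

Even in this simplified form the approach only works if the translation step leaves every placeholder intact and in a sensible position, which cannot be taken for granted when word order changes between languages.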

English Version of the Aircraft Maintenance Manual

The possible integration of MT/CAT only concerns the AEROSPATIALE contribution
(approximately 70 % of the AMM) and more precisely that part of the AEROSPATIALE
contribution written in French. It is evident that the integration of MT/CAT in GIPSY must not
have an adverse effect on the performance of the system as a whole to the detriment of the other
partners and those French authors writing directly in English. It should be noted that, as the
official language is English, GIPSY imposes English as the source language and numerous aids
(such as the automatic generation of standard sentences and automatic call-up of technical
designations) have been built into the system to prompt the authors to write directly in the
system in English. The acquisition of English as the GIPSY source language requires that the
French-to-English translation phase be located upstream and independent of GIPSY. This implies a
preliminary acquisition of a pre-edited French text for the MT/CAT process and a second
acquisition under GIPSY of the translated text after post-editing to correct the translation and to
restore the coding specific to GIPSY, etc.

This process considerably retards the availability of data and is contrary to the on-line update
principle. Furthermore, it is difficult to clearly define responsibilities for the various phases of the
process.

Initial  write-up  and  integration  of  tags  to  identify  significant  data  are  the  responsibility  of  the 
authors  whereas  the  pre-editing  and  especially  the  post-editing  fall  rather  under  the 
responsibility  of  the  translator. 

The  utilization  of  MT/CAT  in  conjunction  with  GIPSY  for  the  production  of  the  English  version  of 
the  AMM  therefore  imposes  serious  drawbacks  which  are  contrary  to  the  basic  philosophy  of 
GIPSY. 

The complete process, which involves acquiring the data twice, is lengthy, complicated and not at
all cost-effective.

This is particularly the case when the manual is in the revision phase during which most of the
work involved in updating the manual is such that the French authors prefer to write directly in
English.

Finally, one other factor complicates the use of MT/CAT for the production of the English
version of the manual. The ATA 100 now requires that the AMM be written in Simplified English.
Simplified English is a controlled language specifically developed by AECMA (Association
Européenne des Constructeurs de Matériel Aérospatial) and the AIA (Aerospace Industries
Association of America Inc.) for aircraft maintenance documentation. Simplified English consists
of a limited vocabulary and a set of writing rules for using that vocabulary. The limited vocabulary
(approximately 800 words) includes verbs, prepositions, conjunctions, adjectives, adverbs and
nouns. In this limited vocabulary, a family of synonyms is represented by only one of its members.
For example "start" is used instead of "begin, commence, initiate or originate". Also, as a general
rule, the words in this limited vocabulary have only one meaning. For example "fall" is used to
indicate the idea of gravity and not the idea of decrease in quantity. Finally the words in the
limited vocabulary can only be used as the part of speech indicated in the dictionary. For example
"check" can be used as a noun but not as a verb.

As  for  the  writing  rules,  they  have  been  developed  to  make  the  written  message  easier  to 
understand  by  users  of  the  manual  whose  first  language  is  not  English.  Sentence  length  is  limited 
to  20  words,  verbs  are  used  in  three  tenses  only  (present,  past  and  simple  future),  noun  clusters 
are  broken  down,  the  passive  voice  is  to  be  avoided,  articles  must  be  used,  etc.... 
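
Two of these rules, the 20-word sentence limit and the single approved member of a synonym family, lend themselves to mechanical checking. The sketch below illustrates the principle only. The `PREFERRED` table holds just the "start" family quoted in the text, not the real dictionary, and the function name and word-splitting rule are hypothetical simplifications.

```python
import re

MAX_WORDS = 20  # sentence-length limit quoted in the text

# Illustrative subset only: the "start" family given as an example in the text.
PREFERRED = {
    "begin": "start",
    "commence": "start",
    "initiate": "start",
    "originate": "start",
}


def check_sentence(sentence):
    """Return a list of rule violations found in one sentence."""
    words = re.findall(r"[A-Za-z']+", sentence)
    problems = []
    if len(words) > MAX_WORDS:
        problems.append(f"too long: {len(words)} words (limit {MAX_WORDS})")
    for word in words:
        approved = PREFERRED.get(word.lower())
        if approved:
            problems.append(f"use '{approved}' instead of '{word}'")
    return problems


print(check_sentence("Begin the test."))
print(check_sentence("Start the test."))
```

Checks of this kind detect a violation but cannot repair it. Choosing the correct reformulation requires knowledge of the intended meaning, and that is where the constraint bears on MT/CAT.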

Although Simplified English is easier to understand for the users of the manual, for the
Manufacturer it constitutes a new constraint. To comply with the requirements of Simplified
English it is often necessary not only to replace unapproved words by approved ones but to
completely reformulate the initial idea. For example, "switch" not being an approved verb, "switch
on NAVI" has to be reformulated to something like "set the NAVI switch to the ON position".
This reformulation is fairly simple for the authors writing directly in English. However, for the
translator, the reformulation may require additional information not contained in the initial
sentence. For example, "Action on the ENG/FIRE pushbutton switch arms the squibs" cannot be
transformed into good Simplified English without specifying what "action" is required (push or
release?). Whereas the human translator can cope with such problems, if necessary by contacting
the author, this is not the case with MT/CAT.

The  requirement  to  write  in  Simplified  English  is  therefore  another  argument  against  the  use  of 
MT/CAT  for  the  production  of  the  English  version  of  the  AMM. 


French  Version  of  the  Aircraft  Maintenance  Manual 


The  French  version  of  the  manual  is  produced  by  extraction  and  translation  of  the  applicable  texts 
existing  in  English  in  the  GIPSY  files. 

The  translation  of  the  AEROSPATIALE  contribution  is  facilitated  by  retrieval,  where  available,  of 
the  drafts  written  in  French. 

The  ideal  situation  would  be  to  transfer  the  contents  of  the  English  files  to  the  French  files  via  an 
integrated  MT/CAT  stage. 

Here  again,  however,  the  specific  formats  of  the  GIPSY  files  raise  problems  of  compatibility  with 
existing  MT/CAT  systems.  The  development  of  programs  capable  of  automatically  converting  the 
contents  of  the  GIPSY  files  to  and  from  an  MT/CAT  compatible  format  was  envisaged  but  had  to 
be  abandoned  for  cost  reasons  (cost  of  program  development  and  subsequent  processing  time). 

Due  to  these  interfacing  problems  and  as  for  the  English  version  of  the  manual,  the  only 
possibility  is  to  keep  the  MT/CAT  system  independent  of  GIPSY.  This  implies  obtaining  outputs  of 
the  English  texts,  pre-editing  and  acquiring  them  for  MT/CAT,  post-editing  the  results  and 
acquiring  them  in  GIPSY  format. 

This  is  obviously  a  lengthy  process  which  cancels  out  the  advantages  of  using  MT/CAT. 

Furthermore,  GIPSY  offers  the  translator  working  directly  in  the  system  aids  similar  to  those 
available  to  the  authors  (automatic  generation  of  standard  sentences,  technical  designations, 
etc...). 

To  date,  therefore,  the  constraints  imposed  by  the  large  technical  publications  production 
systems  on  the  one  hand  and  the  MT/CAT  systems  on  the  other  are  such  that  it  is  difficult  to 
envisage  a  rational  and  economic  integration  of  the  two  with  current  technologies. 

The  use  of  MT/CAT  has  also  been  investigated  for  the  production  of  another  technical  document 
called  Service  Bulletin  (SB).  An  SB  is  a  self-contained  document  describing  a  modification  and 
containing  instructions  for  embodying  the  modification  on  the  aircraft  or  an  item  of  equipment. 
An  SB  generally  comprises  between  5  and  200  pages  and  annual  production  is  currently  around 
13,000  pages. 

Here  again,  the  specific  nature  of  these  documents,  the  possibility  of  retrieving  standard 
sentences,  the  obligation  to  strictly  comply  with  technical  designations  and  the  requirement  to 
write in Simplified English cancel out, through the pre- and post-editing involved, any advantages that
could result from the utilization of MT/CAT.

Indeed  this  is  confirmed  by  the  fact  that  one  of  the  AEROSPATIALE  translation  subcontractors, 
who  is  also  the  agent  in  France  for  a  well-known  CAT  system,  finds  it  more  economical  to 
translate  SBs  "manually".  Furthermore,  the  Airbus  SBs  will  soon  be  produced  using  GIPSY. 


(2)  Technical  Problems  Associated  with  the  Translation  of  Miscellaneous  Documents 

The  Technical  Publications  Dept.  Translation  office  is  not  only  responsible  for  the  translation  of  the 
manuals  but  also  of  a  large  quantity  of  miscellaneous  documents  for  the  Product  Support  and  other 
Aircraft  Division  Directorates.  These  documents  can  be  divided  into  two  categories : 

-  outgoing  documents,  generally  written  in  French  and  translated  into  English, 

-  incoming  documents,  generally  received  in  English  and  translated  into  French  for  internal 
utilization. 

These  documents  represent  the  translation  of  some  10,000  pages  annually  which  is  sufficient  to 
envisage  the  utilization  of  MT/CAT. 

Outgoing  Documents 

The  source  texts  generally  arrive  in  the  translation  office  in  the  form  of  hand-written  rough  drafts. 
When  the  author  is  distant  from  the  translation  office  these  drafts  are  often  sent  by  fax. 

It  is  also  worth  mentioning  that  these  drafts  are  written  by  authors  who  know  that  their  texts  are 
going to be translated and, if necessary, re-organized into a logical presentation, completed or
corrected. 

The  utilization  of  MT/CAT  requires  firstly  that  the  source  texts  be  acquired  into  the  system  and 
secondly  that  the  source  texts  be  of  a  quality  compatible  with  a  satisfactory  result.  MT/CAT  does  not 
escape the rule applicable to all computer systems: "garbage in, garbage out". Investigation into the
outgoing documents has shown that, in the majority of cases, the time required to prepare the texts
for MT/CAT and to post-edit the results exceeds that necessary for a human translation.
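The comparison made here can be written down as a simple time budget. The sketch below is illustrative only: the class and function names are invented for the example, and the figures in the usage lines are placeholders, not measurements from the AEROSPATIALE study.

```python
from dataclasses import dataclass


@dataclass
class MtEffort:
    """Human time spent around one MT/CAT job, in minutes (hypothetical fields)."""
    acquisition_min: float    # typing or scanning the source text into the system
    pre_editing_min: float    # cleaning up the draft so the machine can cope
    post_editing_min: float   # correcting the machine output

    def total(self) -> float:
        return self.acquisition_min + self.pre_editing_min + self.post_editing_min


def mt_saves_time(mt: MtEffort, human_translation_min: float) -> bool:
    """True only if the MT/CAT route costs less human time than translating by hand."""
    return mt.total() < human_translation_min


if __name__ == "__main__":
    # Placeholder figures for a hand-written draft received by fax.
    job = MtEffort(acquisition_min=20, pre_editing_min=15, post_editing_min=40)
    print(job.total(), mt_saves_time(job, human_translation_min=60))
```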

Furthermore,  an  ever-increasing  number  of  these  documents  and  especially  correspondence 
addressed to the partners (memos, meeting reports, etc...) is being written directly in English with the
assistance  of  or  a  quick  check  by  the  translators  when  deemed  necessary. 

The  translation  of  outgoing  documents  is  tending  to  be  limited  to  those  texts  requiring  special 
attention  (contracts,  specifications,  technical  reports,  press  articles,  brochures,  etc ...)  and  which  would 
require  careful  post-editing  if  MT/CAT  were  used. 


Incoming  documents 

The  majority  of  incoming  documents  arriving  in  English  are  used  directly  as  such  without  being 
translated  into  French.  The  role  of  the  translator  is  often  limited  to  providing  verbal  confirmation  of 
the  correct  comprehension  of  certain  specific  points.  There  is,  however,  a  relatively  small  number  of 
documents  for  which  a  written  translation  is  required. 

To  enable  the  utilization  of  MT/CAT  for  the  translation  of  these  documents  they  must  first  be 
transferred to an EDP medium. The simplest and quickest way of doing this is to use a scanner but this is
only possible if the document is of sufficient quality: typed, no stamps, no annotations, no folds or
marks  from  photocopy  machine,  etc  ... 

Unfortunately  such  quality  is  rare  and,  therefore,  most  of  these  documents  would  have  to  be  re-typed 
into  the  MT/CAT  system.  In  most  cases  a  certain  amount  of  pre-editing  would  be  required  and 
glossaries would have to be updated. The post-editing, however, could be "à la carte"; in some cases
the draft translation with little or no post-editing would be sufficient, in other cases fine post-editing
would  be  required  to  achieve  the  desired  standard. 

As  the  majority  of  incoming  documents  are  fairly  well  adapted  to  MT/CAT,  the  justification  of  an 
MT/CAT  system  would  depend  essentially  on  the  volume  of  translation  concerned  and  a  reduction  in 
translation lead times. The current price of subcontracted translation outside PARIS and notably in the
TOULOUSE  area  is  close  to  if  not  lower  than  the  price  of  MT/CAT  translation.  The  cost-effectiveness  of 
MT/CAT  in  this  context  is  therefore  largely  dependent  on  there  being  a  requirement  for  a  large 
quantity of translation with a minimum of post-editing.

These  conditions  cannot  be  met  without  increasing  the  number  of  potential  users  by  becoming  a 
central  server  for  the  whole  of  the  Aircraft  Division  or  even  the  Company. 


B.  HUMAN  AND  PSYCHOLOGICAL  PROBLEMS 

The  arrival  in  a  department  of  new  tools  and  methods  always  gives  rise  to  a  certain  apprehension 
especially  when  computerization  is  involved.  This  apprehension  is  rapidly  dissipated  if  the  new  tools  and 
methods  prove  efficient  and  improve  working  conditions. 

In  the  case  of  MT/CAT,  the  apprehension  of  the  translators  is  often  amplified  by  the  more  or  less  justified 
impression  that  the  system  constitutes  a  direct  rival  rather  than  just  another  tool  at  their  disposal.  Indeed, 
at  the  very  idea  of  MT/CAT,  the  majority  of  translators  see  themselves  being  subordinated  to  the  machine, 
deprived  of  the  creative  aspects  of  their  work  and  reduced  to  trivial  tasks  peripheral  to  translation  proper, 
such  as  pre-  and  post-editing,  updating  glossaries,  etc ... 

These fears, although often exaggerated, are not always totally unfounded.

It  cannot  be  denied  that,  in  the  minds  of  many,  translators  represent  a  source  of  extra  costs  and  MT/CAT  a 
means of limiting them. It is obvious, therefore, that the acquisition of an MT/CAT system must result in
significant  savings  in  terms  of  translation  costs  and  lead  times,  either  by  a  reduction  in  the  number  of 
translators  or  by  an  increase  in  productivity. 

The  utilization  of  MT/CAT  can,  therefore,  place  the  translator  in  an  ambiguous  situation  where  the  notions 
of reduced costs and lead times conflict with that of "quality".

This situation is of course mainly related to the post-editing of the draft translations output by the
machine.  Translation,  even  when  technical,  can  be  highly  subjective  and  this  subjectivity  can  give  rise  to  a 
certain  frustration  during  the  correction  of  translations  produced  by  someone  else ;  especially  if  this 
someone  else  is  a  computer.  Faced  with  a  draft  translation  from  the  machine,  the  translator  no  longer 
feels  totally  responsible  for  the  quality  of  the  translation  but  obliged  to  accept  a  compromise. 

Some,  over-conscientious,  translators  will  tend  to  completely  rework  the  translation  to  the  detriment  of 
cost-effectiveness.  Others,  not  sufficiently  conscientious  or  influenced  by  the  machine  outputs  will  tend  to 
accept  the  draft  translations  as  such  to  the  detriment  of  quality. 

There  is  a  general  risk  that,  through  too  much  compromise,  the  translators  end  up  losing  interest  in  their 
work. 

These  general  problems  may  be  aggravated  by  others  of  a  more  specific  nature.  At  AEROSPATIALE,  most 
of the translators have received a literary education but work in an environment which is hyper-technical.
They  must,  therefore,  make  considerable  efforts  to  acquire  not  only  the  appropriate  technical  vocabulary 
and  style  but  a  wide  technical  knowledge  of  their  field  of  activity. 

The  technical  authors,  on  the  other  hand,  have  generally  received  a  purely  technical  education  and  must 
make  considerable  efforts  to  acquire  the  art  of  writing  correctly.  The  result  of  this  is  that  a  natural  balance 
develops  between  the  author  and  the  translator,  each,  as  it  were,  compensating  for  the  deficiencies  of  the 
other. 

The arrival of an MT/CAT system could upset this balance. The translators may have problems adapting to
the new situation or may even be obliged to change activity. Indirectly, however, there may also be
problems for the authors whose texts are not always suitable for direct translation by a machine. In order
to reduce to a minimum the problems related to the integration of an MT/CAT system, it is essential to
prepare a detailed specification and to associate the translators and the authors in the preparation of this
specification. Without this specification, there is a serious risk of acquiring a system that is not adapted to
the requirements. Used in a rational manner, in a well defined and controlled context, MT/CAT could
liberate  the  translators  from  tedious  routine  translations  and  leave  them  more  time  to  concentrate  on  the 
translations  that  are  ill-adapted  to  MT/CAT  or  that  require  special  attention. 

The fact is that many of the problems associated with MT/CAT are due not to the system itself but to a poor
evaluation of the needs and a false idea of the capabilities of the system.

Only  too  often  there  is  a  certain  reluctance  to  accept  the  studies  carried  out  by  the  translators  themselves 
on  the  grounds  that  they  are  too  anxious  to  protect  their  profession  and  refuse  progress.  In  reality, 
however,  translators  in  general  and  the  AEROSPATIALE  translators  in  particular  have,  by  the  very  nature  of 
their work, developed a considerable capacity for adaptation. They are fully aware of the fact that it is
better  to  actively  participate  in  and  thereby  influence  progress  rather  than  to  obstruct  it  artificially  and 
finally  have  it  imposed  upon  them. 

It  is  therefore  essential  that  the  translators  play  a  predominant  role  in  the  preparation  of  the  specification 
relative  to  MT/CAT  and  that  they  participate  in  the  decision  as  to  the  eventual  choice  of  a  system.  It  is  thus 
possible to avoid problems arising from a poor understanding of the systems, of what they can do and
what  they  cannot  do. 

Either  the  MT/CAT  system  is  adapted  to  the  real  needs  of  the  given  context,  in  which  case  there  are  few 
problems whether psychological or otherwise, or it is not, with all the consequences that this implies.

3.  CONCLUSION 


There  is  currently  a  wide  range  of  MT/CAT  systems,  both  large  and  small,  on  the  market  capable  of  providing 
valuable  services  in  well  defined  contexts. 

Too often, however, the acquisition of an MT/CAT system results in failure. In most cases this failure is not due to the
system itself but rather to the fact that it is not adapted to the specific user requirements either because these
and the integration of the system were not sufficiently analyzed or because the capabilities of the system
were over-estimated for the given application. Several "general purpose" systems, after a commercially
successful period have seen their sales drop dramatically. Other systems, designed in a given context to meet
clearly  defined  requirements  continue  to  give  full  satisfaction  and  in  these  cases  the  research  and 
development  is  being  actively  pursued  to  further  increase  their  performance.  At  the  present  time,  it  would 
seem  as  though  MT/CAT  is  going  through  a  period  of  transition  where  it  is  important  to  learn  from  the  errors 
of  the  past  to  better  prepare  the  future. 

At  AEROSPATIALE,  the  aircraft  technical  publications  are  also  entering  a  period  of  transition  and  substantial 
investments  are  being  devoted  to  preparing  the  future.  The  technical  publications  systems  are  being  totally 
redesigned to enter the era of fully computerized transmission of technical data. The major preoccupation is
to  develop  open-ended  systems  compatible  with  the  new  concepts  of  data  management,  organization, 
transmission  and  utilization.  AEROSPATIALE  is  currently  capable  of  providing  technical  publications  on 
CD-ROM  and  is  actively  working  towards  the  direct  consultation  by  the  Airlines  of  the  Manufacturer  data 
bases.  An  advanced  studies  group  is  starting  to  investigate  the  yet  unexplored  field  of  "intelligent"  graphics. 
All  these  developments  have  and  will  continue  to  have  repercussions  on  translation  and  the  needs  for  and 
utilization  of  MT/CAT. 

In  the  constant  search  for  new  ways  of  reducing  production  costs  and  lead  times  while  increasing 
productivity,  MT/CAT  was  immediately  seen  as  a  valuable  means  of  contributing  to  these  objectives.  However, 
this  position  has  now  been  modified  mainly  because  of  the  problems  involved  in  integrating  MT/CAT  in  the 
new  technical  publications  production  systems. 

The  current  developments  do  not  facilitate  this  integration  and,  in  fact,  the  whole  technical  publications 
environment  is  moving  so  fast  that  it  is  difficult  to  precisely  define  needs.  It  is,  however,  certain  that  an 
eventual MT/CAT system would have to be capable not only of adapting to this environment but also of
evolving  with  it.  The  MT/CAT  systems  available  today  are  best  suited  to  fairly  long  texts  with  conventional 
layout. 

However,  the  current  trend  with  aircraft  technical  publications  is  to  break  down  the  texts,  whether 
descriptive or procedural, into small highly-coded documentary units of just a few lines. These documentary
units contain little "free" text but include numerous codes for calling up precise items of technical data or
standard  terminology/sentences  from  associated  source  files.  It  must  be  stated  that  these  new  production 
systems  were  designed  to  satisfy  a  certain  number  of  essential  requirements  and  that  the  integration  of 
MT/CAT  was  not  one  of  them.  Due  to  the  complexity  of  this  integration  and  the  risk  of  having  an  adverse 
effect  on  overall  system  performance  notably  to  the  detriment  of  the  Partners  who  do  not  require  MT/CAT 
capabilities, the MT/CAT solution has, for the moment at least, been abandoned. Efforts are now being
concentrated  on  getting  the  technical  authors  to  write  directly  in  English.  It  must  be  admitted  that  the 
majority  of  texts  in  the  Aircraft  Maintenance  Manual  do  not  present  any  major  linguistic  difficulties  and  that 
the author aids, developed with the assistance of the translators and built into the system, greatly facilitate
this.  The  translators  are  of  course  available  for  linguistic  assistance  or  the  translation  of  more  complicated 
texts  when  required.  This  method  of  working  has  the  added  advantage  of  being  consistent  with  the  basic 
philosophy  of  the  new  technical  publications  production  systems  which  calls  for  immediate  availability  of 
data  through  on-line  acquisition  by  the  authors. 

As  far  as  technical  publications  are  concerned,  therefore,  the  role  of  the  translator  is  evolving  towards  more 
and more terminology as opposed to translation as such. Indeed the translators are involved from the earliest
stages of a program in preparing the terminology specific to that program (equipment and system
designations). This terminology will be used on the design drawings, on the aircraft itself and throughout the
technical  publications.  The  role  of  the  translator,  therefore,  is  increasingly  to  initialize  author  aids  integrated 
in  the  technical  publications  production  systems. 

There remains, however, a wide range of other documents for which the services of a translator are still
required although, here too, there is an ever-increasing tendency to write or use the more simple documents
directly in English. A significant proportion of the miscellaneous documents requiring translation is
compatible with MT/CAT but the volume they represent within Product Support alone is not sufficient to
warrant the acquisition of a system. This could only be justified by offering the services of the MT/CAT system
to a wider population within the Company.

Despite  the  difficulties  currently  encountered  with  MT/CAT,  the  translators  in  the  AEROSPATIALE  AIRCRAFT 
DIVISION  PRODUCT  SUPPORT  organization  continue  to  follow  and  participate  in  the  development  of  various 
MT/CAT systems. They have prepared a specification for an MT/CAT system adapted to their specific working
environment but, to date, no system has been found that meets the requirements.

A  study  group  organized  by  these  same  translators  is  tending  towards  the  definition  of  what  has  been  termed 
a  "translator  workstation"  rather  than  towards  conventional  MT/CAT.  These  workstations,  better  adapted  to 
the diversity of tasks performed by the translators, would be connected in a ring to a server and dispose of a
certain number of shared translation aids such as word processing, glossary management software, reference
documents on CD-ROM and could, if applicable, integrate an MT/CAT system. These workstations would be
fully integrated in the Company EDP and office automation environment:

-  direct  access  to  the  EDP  network  for  consultation  and  update  of  the  files  managed  by  the  mainframe 
computer  and  particularly  those  related  to  the  technical  publications. 

-  direct  access  to  the  office  automation  network  to  facilitate  the  transmission  of  documents  between  the 
translators  and  their  "customers"  and  to  access  reference  data  bases  connected  to  this  network. 

The  translator  workstation  will  probably  constitute  an  intermediate  solution  for  the  short  and  medium  term 
but  it  is  difficult  to  forecast  what  the  long  term  situation  will  be.  At  present,  the  fact  that  English  is  considered 
as  the  international  aeronautical  language  and  used  as  the  official  language  in  the  AIRBUS  and  ATR  GIEs 
combined  with  the  fact  that  the  vast  majority  of  the  AIRCRAFT  DIVISION  products  are  sold  to  export  has 
prompted  AEROSPATIALE  to  make  considerable  efforts  to  produce  and  use  documents  directly  in  English. 
However,  the  recent  changes  in  the  European  political  scene  and  their  repercussions  on  the  international 
commercial  landscape  will  inevitably  have  an  impact  on  the  demands  for  translation. 

There  is  no  doubt  that  new  generation  MT/CAT  systems,  with  increased  performance  and  new  capabilities  will 
have  an  important  role  to  play  in  tomorrow's  world  of  international  communications.  It  is,  therefore, 
essential,  despite  the  difficulties  encountered  today,  that  Companies  whose  activity  is  largely  dependent  on 
international  commerce  and  therefore  international  communications  continue  to  participate  in  the 
development  of  MT/CAT  systems.  This  is  the  only  way  to  ensure  that  the  new  generation  systems  will  be  really 
adapted  to  their  requirements. 

To assure international communication the introduction of machine translation systems is
unavoidable. To be of use in practical applications, however, a system must fulfill the
criteria of operability defined by the end user. In this context, two different
applications must be contrasted, namely automatic translation with the aim of gathering
information for internal applications on one hand and automatic translation with the aim of
producing  publications  for  external  recipients  on  the  other  hand.  In  the  first  case, 
throughput  and  coverage  of  the  system  lexicon  are  most  important  while  preservation  of 
layout  and  format  information  are  secondary.  In  the  latter  case,  a  much  higher  translation 
quality  is  required  to  ensure  user  acceptance.  To  be  effective  in  an  office  environment, 
additional  aspects  become  relevant:  user  interface,  integration  with  other  office  systems, 
efficient  lexicon  update  and  postediting  tools.  Necessary  for  all  types  of  applications  is 
an  intensive  end  user  training  and  continuing  support  from  specialist  consultants. 


In  the  light  of  the  explosion  of  knowledge  and  the  necessity  to  gather  and  exchange 
information  across  national  and  linguistic  boundaries,  the  introduction  of  machine 
translation has become inevitable. Human translators may feel threatened in their job
security  but  such  fear  is  usually  caused  by  a  misunderstanding  of  what  a  machine  translation 
system  can  and  cannot  do.  A  machine  translation  system,  even  the  most  powerful  one  available 
today,  will  not  replace  a  highly  qualified  human  translator.  There  is  no  linguistic  theory 
in  sight  which  would  permit  the  complete  and  unambiguous  analysis  or  generation  of  a  single 
natural language. In other words, a human revision of machine-translated texts will always
be  necessary,  and  certain  types  of  text  will  by  definition  be  reserved  for  human 
translation,  namely  any  text  in  which  nuances  of  style  need  to  be  preserved  or  in  which  the 
meaning  is  hidden  "between  the  lines".  These  types  of  text  unsuitable  for  machine 
translation  include  not  only  literary  works  but  also  political  speeches  and  quarterly 
reports. 

For  the  translation  of  texts  conveying  factual  information  such  as  technical  documentation, 
scientific  abstracts  or  fact  sheets  a  machine  translation  system  is  a  powerful  tool  able  to 
increase  a  translator's  productivity  by  several  factors  -  provided  the  system  is  designed 
with  the  translator's  requirements  in  mind.  To  define  the  needs  of  an  end  user  we  need  to 
differentiate  two  possible  applications,  the  use  of  machine  translation  to  gather 
information  for  internal  purposes  on  one  hand,  and  the  use  of  MT  with  the  aim  of  producing 
publications  for  external  recipients. 

Machine translation for information gathering

If machine translation is used for purposes of
information gathering it is likely that the texts to be translated come from various
heterogeneous  sources  and  cover  a  wide  range  of  different  topics.  For  an  MT  system  to 
operate  effectively  it  must  contain  a  very  large  lexicon  covering  many  different  subject 
fields.  However,  one  must  keep  in  mind  that  it  is  by  definition  impossible  to  incorporate  a 
"complete"  lexicon  for  all  applications.  The  general  vocabulary  of  a  language  like  English 
may amount to perhaps 300 000 entries. The sum of the concepts in the various sublanguages,
e.g. in medicine, chemistry, data processing etc., by contrast is estimated to be in the
area of 30 to 50 million. The vocabulary within specific subject fields increases much more
rapidly than that of the general language, and it would be a hopeless task to try and keep
up with the lexical change in all subject fields. So even for purposes of information
gathering, a kind of specialization is necessary.

When comparing the lexicon size of different machine translation systems one should not take
a vendor's figures at face value. The structure of a lexical entry may differ greatly from
one system to another. One possibility is to list all complete word forms as separate
entries. For a language without major inflection like English this might not be too
disadvantageous. For languages like German or French however such an approach would inflate
the lexicon by a factor of ten or more, without increasing text coverage at all. For the
German verb "bestehen" for example there would have to be seventeen separate entries.
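The inflation caused by full-form listing can be made concrete by enumerating the word forms of a single verb. The sketch below is a simplified illustration, not a description of any actual MT lexicon: it uses a small ending table for regular (weak) German verbs only, ignores subjunctive forms and declined participles, and takes "machen" as an assumed example. It does not attempt to reproduce the figure quoted for the strong verb "bestehen".

```python
# Simplified ending tables for a regular (weak) German verb.
# Assumption: indicative present and past only, plus two participles.
PRESENT_ENDINGS = ["e", "st", "t", "en", "t", "en"]
PAST_ENDINGS = ["te", "test", "te", "ten", "tet", "ten"]


def full_forms(stem):
    """Return the set of distinct word forms a full-form lexicon would list."""
    forms = {stem + ending for ending in PRESENT_ENDINGS}
    forms |= {stem + ending for ending in PAST_ENDINGS}
    forms.add("ge" + stem + "t")   # past participle, e.g. "gemacht"
    forms.add(stem + "end")        # present participle, e.g. "machend"
    return forms


if __name__ == "__main__":
    forms = full_forms("mach")
    # One headword ("machen") versus many full-form entries.
    print(len(forms), sorted(forms))
```

Each of these forms would be a separate entry in a full-form lexicon although all of them carry the same lexical information, which is why raw entry counts say little about text coverage.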

Another approach will carry separate entries for each word stem. The English verb "go" would
be listed under three entries, for "go", "went" and "gone". Again, this will inflate the
number of lexical entries. More modern systems will have just one entry per headword and
generate all inflected forms by reference to grammar rules or by consulting morphological
tables. Another aspect to be considered is the treatment of compound words. Some languages
such as German can form new terms by linking existing words. In the majority of cases,
these  compounds  can  be  translated  into  English  on  the  basis  of  the  individual  components.  A 
German  term  like  "Plattenspeichersubsystem"  will  translate  nicely  as  "disk  storage 
subsystem".  Provided  the  machine  translation  system  has  grammar  rules  which  are  able  to 
analyze  such  compounds,  a  lot  of  lexical  entries  are  superfluous. 
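Compound analysis of this kind can be sketched as a recursive segmentation against the component lexicon. The code below is a minimal illustration under stated assumptions: the three-entry lexicon, the function names and the longest-prefix-first strategy are all invented for the example, and linking forms such as "Platten" are simply stored as they appear rather than derived by rule.

```python
# Hypothetical component lexicon (German component -> English gloss).
COMPONENTS = {
    "platten": "disk",
    "speicher": "storage",
    "subsystem": "subsystem",
}


def split_compound(word, lexicon=COMPONENTS):
    """Split a compound into known components, or return None if it cannot be covered."""
    word = word.lower()
    if not word:
        return []
    # Try the longest known prefix first, backtracking if the rest fails.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            rest = split_compound(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


def translate_compound(word, lexicon=COMPONENTS):
    """Translate a compound component by component; None if analysis fails."""
    parts = split_compound(word, lexicon)
    if parts is None:
        return None
    return " ".join(lexicon[part] for part in parts)


if __name__ == "__main__":
    print(translate_compound("Plattenspeichersubsystem"))
```

With three component entries the system covers the compound without a dedicated entry for it; compounds whose meaning is not the sum of their parts would still need entries of their own.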

When  dealing  with  large  numbers  of  documents  from  heterogeneous  sources  it  would  be 
advantageous  to  automatically  identify  the  subject  field  of  the  text.  At  present,  no 
machine  translation  system  seems  to  have  this  feature.  The  same  word  of  course  will  denote 
entirely  different  concepts  depending  on  subject  field  and  require  a  different  translation. 

A  "trunk"  may  be  a  part  of  a  car  or  a  communication  line,  not  to  mention  a  suitcase-like 
container  or  an  elephant’s  proboscis.  To  ensure  an  adequate  translation,  the  MT  system 
needs  to  be  geared  to  the  relevant  subject  field.  If  it  cannot  be  done  automatically  the 
human  translator  needs  to  set  the  "bias"  manually. 
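Setting the "bias" manually amounts to selecting one sense per word from a lexicon keyed by subject field. The sketch below assumes English-to-French translation; the subject-field labels, the French equivalents and the function name are illustrative assumptions rather than entries from any real system lexicon.

```python
# Hypothetical sense table: word -> {subject field -> target-language term}.
SENSES = {
    "trunk": {
        "automotive": "coffre",
        "telecommunications": "jonction",
        "luggage": "malle",
        "zoology": "trompe",
    },
}


def translate_term(word, bias, senses=SENSES):
    """Return the translation for `word` in subject field `bias`, or None if unknown."""
    by_field = senses.get(word.lower())
    if by_field is None:
        return None
    return by_field.get(bias)


if __name__ == "__main__":
    for field in ("automotive", "telecommunications"):
        print(field, "->", translate_term("trunk", field))
```

Returning None for an unknown subject field, rather than guessing a default sense, leaves the decision to the human translator, in line with the manual setting described above.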

If  the  documents  to  be  translated  come  from  many  different  sources  it  is  highly  unlikely 
that they all adhere to the same formatting conventions. It is more probable that a variety
of  usually  incompatible  editors  and  word  processors  are  used,  and  that  a  fair  percentage  of 
the  texts  may  even  be  composed  on  paper.  That  greatly  diminishes  the  chances  of  being  able 
to  process  the  material  in  machine-readable  form.  For  the  application  of  machine 
translation of sundry documents for purposes of information gathering, the installation of a
font reader should prove economical. Having a staff of typists input the texts received is
not just expensive but introduces errors. Font readers, to be sure, are not perfect but
some of the newer models are able to handle a variety of fonts and have "learning"
capability, i.e. the ability to adapt the recognition to specific features of the document.

One  of  the  great  problems  in  the  production  of  multilingual  documents  is  the  need  to 
preserve  the  format  of  the  original.  That  aspect  is  fortunately  of  less  concern  in 
information  gathering.  It  is  usually  sufficient  to  extract  the  relevant  text  portions  from 
the  document.  Only  in  rare  cases,  such  as  the  interpretation  of  flow  charts  and  tables,  will 
the page layout be of sufficient interest to warrant its reformatting. As a rule, a
translation of the running text will be enough of an aid for a specialist to understand the
content of the document. Should the automatic treatment of the document prove inadequate a
human  revision  or  even  a  completely  new  human  translation  might  be  added. 

Collecting  information  from  foreign-language  sources  involves  several  parameters.  Usually  a 
very  large  amount  of  text  needs  to  be  translated,  and  the  rapid  accessibility  of  the 
information  is  of  great  importance.  As  the  information  is  usually  utilized  by  specialists 
who know the subject fields well the stylistic quality of the translation is of lesser
consideration. In practical terms this means for an end user that the machine translation
system has to be able to process a lot of text very rapidly. If a thousand documents per day
need to be translated a sophisticated MT system with superior translation quality but
insignificant throughput would be useless. However, as was pointed out earlier, it would be
impossible to have a single translation system dealing adequately with all types of text and
all subject fields simultaneously. Therefore it might be worth considering running
separate systems for different document types, and with this kind of specialization slower
systems may prove to be adequate in throughput. Rough translations for the purpose of
information gathering do not require the degree of linguistic sophistication in the system
that  is  needed  for  texts  to  be  published.  Very  often  a  local  grammatical  analysis,  i.e. 
one  based  on  the  analysis  of  phrases,  may  suffice.  The  sentence  translated  at  the  phrase 
level  might.make  a  human  translator  throw  up  his  arms  in  despair  but  the  rough  content  might 
still  be  understandable  to  an  expert  in  the  field  of. the  text.  A  note  of  caution  should  be 
added:  a  phrase  level  translation  could  in  this  context  be  adequate  for  an  English  text,  or 
for  any  other  language  that  has  a  rigid  word  order.  It  would  probably  not  work  for  languages 
in  which  phrase  elements  are  not  necessarily  contiguous, e.g.  German. 

The quality of a translation cannot be measured in percentage points. To define a quality
level which is acceptable in the area of information gathering is impossible. This would
hinge on aspects such as the type of source language, the familiarity with the subject
matter on the part of the reader and the degree of precision required. These factors may
vary greatly from one application to another.

One  should  expect  a  machine  translation  system  intended  for  the  "quick  and  dirty" 
translation of large volumes to run on mainframes, perhaps with access from various sites.
However, aside from the fact that general purpose mainframes are notoriously ill-suited for
the processing of natural languages, there is another factor to be considered. Wide
accessibility of the information may be intended in some cases. In other environments, the
information may be classified and may need to be protected against illegal access. Such
protection may be difficult if the translation system resides on a general purpose mainframe
which  is  widely  accessible.  The  problem  is  compounded  if  the  mainframe  is  integrated  in  a 
network.  A  possible  alternative  would  be  a  stand-alone  machine  translation  system  which 
could  more  easily  be  protected  and  which  could  be  made  radiation-free. 

One  of  the  goals  in  connection  with  machine  translation  for  such  purposes  is  the  automatic 
processing  of  the  facts  contained  in  the  documents  for  storage  in  a  data  base.  There  are 
currently  several  projects  under  way  with  this  goal,  e.g.  at  Siemens  in  Munich,  but  besides 
some  severe  theoretical  problems  there  is  the  enormous  expense  of  quantitative  work  load.  No 
quick  solution  is  to  be  expected  here. 

Machine translation for the production of publications

If the gathering of information
requires  primarily  speed  of  throughput  in  the  translation  process,  the  use  of  machine 
translation  for  the  production  of  documents  to  be  published  for  an  outside  world  demands  a 
complete  package  of  solutions  to  be  viable. 

First  of  all,  it  is  usually  a  requirement  that  the  target  text  be  of  high  quality;  in  some 
cases  it  is  required  that  the  translated  text  should  not  be  recognizable  as  a  translation. 

This  presupposes  human  revision  of  the  machine  translated  text.  No  system,  no  matter  how 
sophisticated,  could  fulfill  this  requirement.  This,  however,  is  not  to  be  misunderstood  as 
a  naive  notion  that  all  systems  are  created  equal.  For  a  machine  translation  system  to  be 
used  effectively,  the  human  translators  have  to  accept  it  as  a  tool,  and  one  of  the  prime 
requirements  is  high  translation  quality.  If  translators  have  to  correct  too  large  a 
percentage  of  the  translations  proposed  by  the  machine  they  will  view  the  system  as  a  burden 
rather  than  an  aid  to  productivity.  And  if  the  system  is  not  acceptable  to  the  translators 
no  gain  will  be  realized.  On  the  contrary,  personnel  problems  might  develop. 

The  achievement  of  high  quality  translations  presupposes  a  thorough  linguistic  analysis.  No 
word-level or phrase-level analysis can provide the basis for a plausible interpretation of
a sentence; the minimum requirement is an analysis that takes all elements of a sentence
into consideration. Such an approach may be "expensive" in terms of computing power but in
the area of natural language processing there is no choice. The many ambiguities in the
words of a sentence cannot be resolved by minimalistic local analyses. Natural languages
are  -  contrary  to  the  assumptions  of  the  past  -  not  finite  systems.  Therefore  it  is 
important  that  an  MT  system  has  a  grammar  which  does  not  just  list  a  finite  set  of  legal 
grammatical  structures.  Otherwise  the  next  document,  written  in  a  slightly  different  style, 
may  turn  out  as  unintelligible  gibberish. 

Translation quality depends to a large extent on the power of the grammar, but equally
important is the information contained in the lexicon. And here it is an indispensable
requirement  that  the  translators  working  with  the  system  have  complete  access  to  the 
lexicon,  to  be  able  to  enlarge  and  adapt  it. 

No matter how large the lexicon supplied by the vendor, there will always be the necessity
to  update  it.  Specific  applications  in  defined  subject  areas  demand  the  use  of  specific 
terminology  usually  not  contained  in  general  lexicons,  and  in  industry  very  often 
company-specific  terminology  takes  precedence  over  more  general  terms.  The  lexicon  of  a 
powerful  machine  translation  system  contains  a  lot  of  grammatical  information  which  is  used 
for  the  analysis  or  generation  of  a  language  in  unison  with  the  grammar  rules.  Updating  the 
lexicon does not just mean adding word pairs in two languages but adding morphological,
syntactic and semantic information. A system might need to know that "rely" is a verb, that
it  occurs  with  the  preposition  "on",  that  its  inflection  is  regular  etc.  Keying  in  all  this 
information  may  be  quite  cumbersome  if  it  is  not  supported  by  tools.  In  one  commercially 
available  system,  coding  a  new  term  takes  a  full-fledged  linguist  half  an  hour.  Such  figures 
make  a  system  excessively  expensive.  In  another  system  (METAL)  the  coding  of  lexical  entries 
is  supported  by  an  integrated  expert  system  so  that  new  subject  fields  can  be  added  to  the 
system  lexicon  with  minimal  expense. 
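
The kind of information such an entry has to carry can be sketched as a small data
structure. This is an illustration only: the class `LexicalEntry`, its field names and the
helper `missing_fields` are assumptions made for this sketch and do not reproduce the entry
format of METAL or of any other system.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalEntry:
    """Hypothetical monolingual lexicon entry; all field names are illustrative."""
    lemma: str
    category: str                                       # part of speech, e.g. "verb"
    inflection: str = "regular"                         # morphological class
    prepositions: list = field(default_factory=list)    # prepositions the word occurs with
    semantic_tags: list = field(default_factory=list)   # e.g. subject-field codes


def missing_fields(entry):
    """Return the fields a coding tool might still prompt the translator for."""
    missing = []
    if not entry.category:
        missing.append("category")
    # Assumption of this sketch: verbs are always checked for a governed preposition.
    if entry.category == "verb" and not entry.prepositions:
        missing.append("prepositions")
    return missing


# The example from the text: "rely" is a regular verb occurring with "on".
rely = LexicalEntry(lemma="rely", category="verb", prepositions=["on"])
# An entry that has only been partially coded.
bare = LexicalEntry(lemma="depend", category="verb")
```

A supporting tool of the kind described above would presumably fill in most of these
fields from defaults and ask the translator only for what `missing_fields` reports.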

But  not  only  the  coding  of  lexical  entries  needs  to  be  open  to  the  end  user.  An  acceptable 
translation  quality  can  only  be  achieved  if  the  translation  is  geared  to  a  specific  subject 
field.  Most  systems  nowadays  provide  a  framework  for  subject-specific  lexicon  modules  so 
that  in  the  translation  of  a  given  text  highest  priority  is  given  to  the  transfers  contained 
in  the  most  specific  module.  It  is  important,  however,  that  an  end  user  can  not  only  fill 
existing slots but that he can define the structure of lexicon modules himself. There is no
such thing as a universal classification system, and a certain user in the field of
chemistry  may  have  entirely  different  requirements  for  his  lexicon  structure  than  a  user  in 
the  field  of  civil  engineering. 
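
The priority given to the most specific module can be sketched as an ordered lookup. The
module names, the German terms and the function `lookup_transfer` are hypothetical and
serve only to illustrate the principle; the order of the modules is defined by the user.

```python
def lookup_transfer(term, modules):
    """Return (module_name, translation) from the first module that contains term.

    `modules` is a list ordered from the most specific to the most general module.
    """
    for name, entries in modules:
        if term in entries:
            return name, entries[term]
    return None


# Hypothetical module contents, ordered from most to least specific.
modules = [
    ("company", {"Speicher": "storage unit"}),
    ("data_processing", {"Speicher": "memory", "Rechner": "computer"}),
    ("general", {"Speicher": "granary", "Rechner": "calculator", "Haus": "house"}),
]
```

Because the module list is plain data, a user in chemistry and a user in civil engineering
could each define a different hierarchy without any change to the lookup itself.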

If  machine  translation  is  used  to  generate  publishable  documents  the  integration  into  an 
office  environment  is  of  utmost  importance.  The  best  translation  quality  is  wasted  if  there 
is  no  smooth-running  sequence  of  steps  from  the  original  to  the  target  text.  First  of  all, 
the  original  text  needs  to  be  imported  from  an  external  source  into  the  machine  translation 
system. There have to be physical means for this task: floppy disk drives, tape units and
possibly Ethernet connections to the word processing systems on which the originals are
composed. It is important that besides the text, all graphics and other non-linguistic
material are preserved. This is not a trivial task as we are still faced with a multitude
of incompatible editors and word processors, all of which seem to encode graphic information
differently. 

It  would  be  uneconomical  to  manually  extract  the  text  portions  to  be  translated.  So  there 
has to be a set of programs to automatically identify the translatable text portions and
separate them from the non-translatable material, which may constitute more than 50% of a
page. This information needs to be preserved so that it can be used to reconstitute the
format and layout of the original page after the text has been translated and revised. As
it does not seem likely that such programs will work perfectly on all types of documents,
the  end  user  must  have  the  capability  to  override  the  automatic  process  and  edit  the  various 
intermediate  versions. 
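
The separation and later reconstitution described above can be sketched as follows. The
sketch assumes a toy document format in which formatting commands start with a dot; real
word processor formats are far more varied, and the function names are hypothetical.

```python
def split_document(lines, is_markup):
    """Separate translatable text from non-translatable material.

    Returns (skeleton, segments). The skeleton keeps markup lines verbatim and
    replaces each text line with an integer index into `segments`.
    """
    skeleton, segments = [], []
    for line in lines:
        if is_markup(line):
            skeleton.append(line)
        else:
            skeleton.append(len(segments))
            segments.append(line)
    return skeleton, segments


def rebuild_document(skeleton, translated):
    """Re-insert the translated segments into the preserved layout."""
    return [translated[item] if isinstance(item, int) else item
            for item in skeleton]


def is_markup(line):
    # Assumed convention for this sketch: formatting commands start with a dot.
    return line.startswith(".")


source = [".heading 1", "Einleitung", ".figure pump.img", "Die Pumpe ist defekt."]
skeleton, segments = split_document(source, is_markup)
# A translator or an MT system would supply these; hard-coded here for illustration.
translated = ["Introduction", "The pump is defective."]
target = rebuild_document(skeleton, translated)
```

Keeping `skeleton` and `segments` as separate intermediate versions is what allows the end
user to inspect and correct either one before the page is rebuilt.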

In the early days of data processing the user was expected to adjust to the formalisms
demanded by a system. Fortunately, some progress in this area has made the end user's life
easier. Most machine translation systems nowadays offer a menu-driven interface which is
easy to operate even by translators who are not knowledgeable in the field of data
processing. By clicking on pre-structured command lines, translators can run the system
without endangering its integrity. As the revision of a machine-translated text requires
different editing functions from those needed in general word processing, most systems offer
specialized editors. Typical functions include global replace, transposition of words or
phrases  etc.  If  a  system  is  not  an  insular  implementation  of  a  machine  translation  system 
but  resides  as  an  add-on  module  to  standard  hardware  and  software  it  can  of  course  be  used 
for  other  purposes  as  well,  an  aspect  which  may  figure  in  a  cost/benefit  analysis. 

Machine  translation  systems  are  not  self-explanatory.  Even  if  end  users  are  supplied  with 
adequate documentation they need intensive training by experts. It is not sufficient to be
shown the surface handling by a salesman; in order to understand the operation of the system
properly, users need to be shown the system structure and the interrelation between grammar and
lexicon.  Only  then  will  they  be  able  to  avoid  costly  errors  in  updating  their  lexicon.  A 
machine  translation  system  should  not  be  a  black  box  but  a  transparent  tool  for  the 
translator.  As  updating  the  lexicon  is  one  of  the  major  tasks  for  translators,  a  fair 
amount  of  the  training  period  should  be  spent  on  all  aspects  of  lexicon  work.  The  training 
should  not  only  stress  the  linguistic  parts  of  lexical  entries  but  should  include  an 
analysis  of  the  end  user's  subject  areas  and  terminology  structure  as  well.  Finally  it  is 
important that an end user is shown the methods of postediting a machine-translated text as
this  differs  from  revising  a  "human"  translation.  Prototypical  systems  are  usually  not  fit 
for  productive  applications.  End  users  need  long-term  support  in  the  operation  of  their 
system,  both  in  hardware  and  software  maintenance  and  in  organizational  and  linguistic 
consulting.  As  more  progress  is  made  in  the  field  of  linguistics  it  would  be  wise  to  choose 
a  system  with  a  modular  structure  that  can  incorporate  future  developments  be  they 
additional  languages  or  more  powerful  linguistic  components. 


ASPECTS  OF  MACHINE  TRANSLATION  IN  THE 
UNITED  STATES  AIR  FORCE 

Dale  A.  Bostad 
Foreign  Technology  Division 
United  States  Air  Force 


Machine translation is used in the USAF to translate technical
literature in a wide variety of disciplines to support
studies that assess the capabilities and research of the Soviet
Union and other countries. Two types of machine translation are
used: partially-edited machine translation with hard-copy
printing of the text, and raw machine translation for rapid
information scanning of material at a terminal.

A special software program is used for rapid post-editing
of texts. Potential trouble spots and ambiguities are intercepted
and corrections are made by post-editors.

Raw machine translation for gisting large volumes of information
has proven to be an effective tool for analysts and researchers.
Statistics indicate an ever-increasing use of
machine translation for rapid information scanning.


It  is  the  purpose  of  this  paper  to  describe  the  use  of  machine  translation 
in  the  USAF,  to  describe  the  product,  and  to  indicate  future  developments  in  the 
area  of  rapid  information  acquisition. 

Machine translation is used by the USAF to support studies carried out by
scientific and technical researchers who need to stay abreast of foreign
developments in a wide variety of technical fields. Most of the translations
produced come from open-source literature. A researcher has a broad base of
information at his disposal, including extracts, abstracts, cover-to-cover
translations of foreign journals produced by publishing houses, and other
studies in the field. Machine translations are simply another source of
information.

There  are  two  general  ways  that  machine  translation  is  used  in  the  USAF. 
The  first  use  is  partially-edited  machine  translation  where  a  trained  linguist 
massages  or  corrects  the  raw  machine  translation,  bringing  it  to  a  higher  degree 
of accuracy and readability. The second one is the use of raw machine
translation for information scanning. These two applications will be discussed in some
detail. 


Partially-Edited  Machine  Translation 

The  standard  product  of  the  Directorate  of  Translations  of  the  Foreign 
Technology  Division  (FTD)  is  partially-edited  machine  translation.  Between 
50,000  and  60,000  pages  of  Russian  text  are  translated  each  year  by  the  US  Air 
Force  Systran  Russian  system.  This  is  a  batch  MVS  system,  often  translating 
10,000  sentences  in  40  minutes  (clock  time).  The  system  operates  in  both  a 
classified and an unclassified configuration; however, the majority of
translations are from open-source literature and they are translated on the
unclassified system. The translations produced meet the standards of adequacy for the
users of the product. Machine translation has gained wide acceptance by users
because  it  provides  rapid  turnaround  of  information  and  the  translations  are 
technically  accurate  in  a  wide  range  of  technical  disciplines  and  are  readily 
comprehensible  to  a  subject-area  specialist. 

A  partially-edited  translation  is  produced  by  scrolling  through  the  entire 
translated  text  on  a  video-display  terminal.  However,  only  segments  of  the  text 
are  actually  post-edited.  In  fact,  only  about  20%  of  a  given  text  is  carefully 
looked  at.  What  is  to  be  edited  is  determined  by  a  software  program  called 
EDITSYS. The functioning of this program will be examined in some detail.



EDITSYS 

EDITSYS  is  a  program  called  at  the  end  of  the  translation  procedure  that 
serves  to  direct  the  post-editing,  i.e.,  tells  the  editor  exactly  what  to  look 
at and edit. The program identifies trouble areas in the system that need
review and intercepts these conditions. As stated, large chunks of text go
through unscrutinized without editing. This means that we rely heavily on the
efficacy of the linguistic algorithms and our large dictionaries. To write a
program like EDITSYS one must have considerable knowledge about the strengths
and weaknesses of the system and the programming expertise to highlight the
weaknesses  for  review. 

The  program  itself  is  a  module  that  allows  us  to  go  in  and  test  at  the 
bit/byte level the final analysis area of sentences. Virtually all of the
linguistic macros in the system can be used for testing. When a given test
condition is met the program generates a full-width line of a certain character in
front of the condition, and this line is interspersed in the text and displayed
on the screen. As an editor scrolls through the translated text he halts
whenever a flag line appears, and makes an editing decision. If no editing is
required he continues on to the next flag; otherwise he corrects the error.
Post-editing is limited to the immediate environment around the flag. A skilled
editor can edit 15-20 Russian pages an hour using this technique.

Flags  are  generated  by  EDITSYS  to  check  the  following  situations: 

1. Not-found words. All legitimate not-found words or words incorrectly
input are flagged. True not-found words are now relatively rare, since
the dictionaries contain 200,000 entries.

2.  Acronyms.  All  acronyms  are  checked  to  see  if  their  expansions  are 
correct.  Thousands  of  acronyms  are  expanded  in  the  dictionaries,  but  those  of 
three  characters  or  less  require  close  scrutiny. 

3. Rearrangement. Byte 144 indicating rearrangement is flagged.
Approximately 20% of Russian sentences are rearranged with an accuracy rate of
90%. One rearranged sentence out of ten must therefore be edited where words or
phrases are moved into incorrect slots.

4.  Contiguous  slashed  entries.  There  are  several  thousand  slashed 
entries  in  the  Russian  system,  and  when  slashed  words  in  English  occur  next  to 
each  other  smooth  reading  of  the  text  is  impeded.  The  most  frequent  occurrences 
are  adjective  +  adjective,  adjective  +  noun,  and  noun  +  noun. 

5. Spurious "good" terms. These are words that have been typed
incorrectly but which match up against the dictionary. Examples are BOLE instead
of BOLEYE, SOYA instead of SLOYA, and BIT' instead of BYT'.

6.  Uncertainty  code.  Byte  57,04  is  tested.  This  uncertainty  code  is 
turned  on  in  certain  homograph  routines  at  the  point  where  the  logic  becomes 
tenuous, there is no statistical evidence for one dictionary default over
another, and in fact resolution is a toss-up.

7. Problem words. There is a flag generated for certain problem
words (about 40 in number) which the system has not been able to resolve with
sufficient accuracy. This category is fluid; as routines or expressions are
developed for these words they are no longer flagged. Of course, new conditions
or words also arise which require flagging.
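
The flagging principle can be sketched in a few lines. The sketch does not model the
actual bit and byte tests or linguistic macros of the system; the record fields, the check
names, and the flag character and width are assumptions chosen for illustration.

```python
FLAG_LINE = "*" * 72  # assumed flag character and line width


def flag_reasons(sentence, checks):
    """Return the names of all checks that fire for one analysed sentence."""
    return [name for name, test in checks if test(sentence)]


def interleave_flags(sentences, checks):
    """Yield output lines, with a flag line in front of each flagged sentence."""
    for sentence in sentences:
        if flag_reasons(sentence, checks):
            yield FLAG_LINE
        yield sentence["text"]


# Hypothetical checks on hypothetical per-sentence analysis records.
checks = [
    ("not_found", lambda s: bool(s.get("not_found"))),
    ("rearranged", lambda s: s.get("rearranged", False)),
    ("uncertain_homograph", lambda s: s.get("uncertain", False)),
]

sentences = [
    {"text": "The engine was tested.", "rearranged": False},
    {"text": "Pressure in chamber XYZQ rose.", "not_found": ["XYZQ"]},
    {"text": "Was measured the temperature.", "rearranged": True},
]

output = list(interleave_flags(sentences, checks))
```

The editor then only has to stop at each flag line; unflagged sentences pass through
unchanged, which is what keeps the share of text actually inspected low.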


Raw Machine Translation

Three years ago the USAF developed a new application of its machine
translation system which we call interactive machine translation. This system gives
all users individual access to machine translation at their own terminals. It
is now available to users on approximately 1400 PCs within the Foreign
Technology Division. This is raw machine translation without the mediation of
translators.


The system is designed so that a user can rapidly determine the significance
of the material he wants translated and weed out extraneous information.
It is best used for rapid translation of titles of books, tables of contents,
captions under tables and graphs, and individual sentences and paragraphs.
However, it can also effectively be used to translate complete short articles and to
get back a rapid translation instead of going through the sometimes
time-consuming operation of routing translations through the formal bureaucracy. One
very effective use of the system is for gisting a large book, that is,
determining the significant parts of a book and then routing this material through
the normal translation procedures. For example, if a user has a 350-page book,
the system might be used to determine that only Chapters 3, 7, and 12-15 are
really pertinent to his research. Obviously, by using such a tool there can be
tremendous cost savings by not translating irrelevant material.


Computer Environment of Interactive MT


The  first  thing  that  had  to  be  done  to  develop  interactive  MT  was  to 
reconfigure  the  Systran  systems  to  run  under  IBM  VM/CMS  (Conversational  Monitor 
System)  operating  system.  The  seven-step  traditional  procedure  was  reduced  to  a 
single step and the two IBM sorts were eliminated. In their place, random
access searching was used in main dictionary look-up. Random access lookup of
words  is  very  efficient  when  processing  shorter  files.  These  were  the  changes 
required  as  far  as  Systran  was  concerned. 

The  system  was  then  loaded  on  an  IBM  mainframe  on  a  Systran  disk.  When  a 
user  is  connected  to  the  mainframe,  all  he  need  do  is  type  in  the  command 
SYSTRAN  and  he  is  automatically  linked  to  the  Systran  disk  on  the  mainframe. 
The  user  has  his  own  virtual  machine  running  on  the  host.  This  means  that  he 
commands  nearly  the  equivalent  computing  power  as  if  he  had  the  full  resources 
of the mainframe at his disposal. Thus, on either the unclassified or
classified system, all the user has to do is type in the command SYSTRAN at the CMS
prompt  and  he  is  ready  to  execute  a  translation  session. 


Interactive MT Menu

The  interactive  menu  was  written  by  two  FTD  systems  programmers  using 
VM/CMS  and  XEDIT  and  REXX  macro  languages.  The  primary  consideration  was  to 
make  the  menu  simple  to  use  and  as  short  as  possible,  i.e.,  user-friendly.  I 
will  briefly  describe  the  menu  and  options  available.  After  typing  in  SYSTRAN 
the  first  panel  appears,  displaying 


SYSTRAN 

FTD's  Interactive  Language  Translation  System 

on  the  screen.  Specially  defined  function  keys  at  the  bottom  of  the  screen  then 
direct  the  user  to  proceed  to  the  next  menu,  or  exit.  All  subsequent  menus  have 
dedicated function keys, explained at the bottom of the screen, that quickly
indicate the options available at that point in the process.

In the next panel the languages to be translated are displayed; the following
panel offers a selection of 17 technical dictionaries. The
next panel allows for the creation of a new file or editing of a previously created
one. If, for example, a new file is to be created and the file name is
typed in, the Enter key is pressed and a blank file appears on the screen. A
press of F2 puts the user in the "Power-Typing" mode under XEDIT. Once a file
has been created a press of F10 sends the file to be translated. The words

PLEASE  WAIT  WHILE  I  TRANSLATE 

are displayed in the upper left-hand corner of the screen. The translated
English text will appear on the screen in approximately 20-30 seconds, depending
on the length of the file and activity on the mainframe.

If all words are translated without errors the translation will be
displayed with the message


ALL  WORDS  WERE  TRANSLATED. 

If there are untranslated words from the original file the message

NOT-FOUND  WORDS  EXIST 

will  appear  on  the  screen.  To  find  these  words,  the  user  presses  F9  and  the 
not-found  words  are  highlighted,  in  sequence,  in  the  original  input  file  by  a 
row  of  asterisks  above  and  below  the  not-found  word.  Once  a  correction  has  been 
made,  a  re-press  of  F9  brings  up  the  next  not-found  word,  and  so  on  through  the 
file. After all corrections have been made the user is told that no more
not-found words exist, and to press ENTER to retranslate the file. The corrected
file is then re-translated, in approximately the same length of time.

This  error-correction  process  is  a  unique  and  widely-used  feature  of  the 
menu.  Both  the  input  file  and  the  translation  file  are  permanently  retained  on 
disk  and  can  be  printed  out  on  local  printers. 
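
The correction loop can be sketched as follows. This is not the XEDIT/REXX implementation
of the menu; the function names, the transliterated words and the toy dictionary are
assumptions made for illustration.

```python
def find_not_found(words, dictionary):
    """Return the indices of words that have no dictionary entry."""
    return [i for i, w in enumerate(words) if w.lower() not in dictionary]


def highlight(words, index):
    """Render the text with a row of asterisks above and below words[index]."""
    line = " ".join(words)
    # Column at which the highlighted word starts in the joined line.
    start = len(" ".join(words[:index])) + (1 if index else 0)
    marker = " " * start + "*" * len(words[index])
    return "\n".join([marker, line, marker])


# Hypothetical transliterated input and a toy dictionary.
dictionary = {"novyy", "dvigatel", "rabotaet"}
words = ["novyy", "dvigatl", "rabotaet"]

pending = find_not_found(words, dictionary)
# The user corrects each highlighted word in turn; the file is then retranslated.
corrections = {"dvigatl": "dvigatel"}
for i in pending:
    words[i] = corrections.get(words[i], words[i])
```

In the menu each press of F9 would correspond to one call of `highlight` for the next
pending index, and the final ENTER to a new translation run on the corrected file.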


Use of Interactive MT

As stated previously, the primary use of interactive MT is to provide rapid
translations of short items for information scanning. The system is used by
analysts and analyst assistants in a secure computer environment mode. Statistics
have been kept on usage since the system became operational. The average numbers
of accesses per month over a two-year period are:

Russian    185 accesses, with peak monthly use of approximately 625 accesses

German      27 accesses per month

French      25 accesses per month.

In addition to the primary use of the interactive system - a quick
translation tool providing raw MT for information scanning by FTD analysts - the system
has  several  peripheral  uses,  which  will  be  briefly  discussed. 

1. The system is being made available to select Department of Defense
components, contractors who support FTD research activities, and other
organizations involved in the analysis and assessment of worldwide technological
developments. These remote users can gain access to the translation software on
FTD's mainframe using a modem and telephone lines. This application is
relatively new, and no data currently exists on the extent of usage.

2. The interactive MT system is used as a spelling checker for
correcting typos in large files to be translated by the batch MVS MT system. The
rationale is very simple: the fewer mistyped words in a given text file, the
faster the not-founding procedure and the more accurate the parse. Although
Systran MT systems can tolerate a percentage of not-founds per page and the
system automatically analyzes the function of a not-found word or typo, the greater
the  accuracy  of  the  input  text  the  better  the  MT  results.  We  are  now  using  the 
correction  feature  of  the  interactive  system  to  clean  up  typing  mistakes  in  all 
material  that  is  to  be  sent  through  the  batch  MT  system  and  then  edited  by 
translators.  For  example,  a  local  external  contractor  who  keys  in  Cyrillic  text 
for  FTD  has  dial-up  access  to  FTD's  mainframe.  Once  the  text  has  been  input  it 
is  shipped  to  FTD's  mainframe  and  translated  via  the  interactive  system.  The 
correction  feature  then  highlights  the  typos,  they  are  corrected,  and  the 
cleaned-up  file  remains  on  disk  to  be  accessed  and  later  processed  by  the  batch 
system. It obviously takes more time to translate longer files through the
interactive system. A file for batch processing can be from five to 20 pages
long, and hence the clock time for running the file via the interactive system
may be 15 minutes. But this is merely computer time. The gains in productivity
from using the correction feature to produce files free of all typing errors
have been significant.

3. The interactive MT system under CMS has been downloaded to run on
various configurations of stand-alone IBM personal computers. To date the
systems have successfully run on an IBM AT/370, an IBM AT/370 with an A74 processor
box, and the new PS-2 7437 IBM workstation. In this application exactly the
same menu is used as was described earlier. The potential use of stand-alone
computers  is  evident:  for  users  in  remote  locations  without  access  to  an  IBM 
mainframe  it  is  the  perfect  solution. 

4.  The  interactive  system,  in  the  unclassified  mode,  is  used  within 
the  Directorate  of  Translations  in  several  ways.  First  of  all,  it  is  used  as  a 
quick  diagnostic  tool  for  developers  of  the  system  and  lexicographers.  But  it 
is  also  used  for  very  rapid  turnaround  of  short  documents  that  sometimes  seem  to 
get  lost  in  the  queue  behind  big  books.  Thus,  one  or  two  pages  are  input,  a 
hard copy of the document is edited by a translator, the changes are entered
into the English file, and the document is then printed and returned to the
requester very quickly. Finally, the interactive system is used by
quality-control personnel to translate omissions detected in larger translations when a
large  document  is  being  quality-controlled.  Use  of  the  interactive  system  in 
the  unclassified  mode  is  steadily  increasing.  Total  accesses  now  average  274 
per  month,  with  peak  accesses  approaching  600  per  month. 


The Future

We  are  planning  to  develop  two  software  additions  to  our  MT  systems  in  the 
very  near  future  which  we  believe  will  greatly  increase  the  effectiveness  of  MT 
and its attractiveness to and use by end users.

The first thing we plan on doing is to give the end user the capability of
creating his own dictionary. The dictionary will come in two forms: (1) a
customer-specific PC dictionary, and (2) a customer-specific dictionary with
topical glossaries, also PC-based. The first user-controlled dictionary allows
the user to supply his own terminology on a PC which will supplement the main
Systran dictionary with meanings and grammar codes for not-found words and
replacement meanings for existing Systran dictionary entries. Individual words
and word expressions may be entered. The customer dictionary with topical
glossaries is similar, but it allows the capability of creating and modifying 16
specialty dictionaries in addition to the usual Systran technical glossaries.

The customer-specific dictionary, in both forms, is an override dictionary
to the Systran main dictionaries; it resides in a buffer and allows up to 5000
entries. The dictionary is at no time permanently merged into the main Systran
dictionaries, and hence the integrity of the Systran dictionaries cannot be
jeopardized. However, the user does have the ability to fully control the
translation of certain classes of words and phrases, and his modification
decisions - for good or bad - will be reflected in the translations his system
produces. This is a powerful tool in the hands of end users that must be used with
foresight and care.
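
The override behaviour described above can be illustrated with a minimal sketch. It is not Systran code: the class name, its methods and the sample entries are hypothetical, and only the 5000-entry limit and the rule that the main dictionary is never modified come from the text.

```python
class CustomerDictionary:
    """Hypothetical override layer consulted before a read-only main dictionary."""

    MAX_ENTRIES = 5000  # buffer limit stated in the text

    def __init__(self, main_dictionary):
        self._main = main_dictionary  # only ever read, never merged into
        self._overrides = {}          # the user's own entries

    def add(self, source, target):
        """Add a not-found word or replace the meaning of an existing one."""
        is_new = source not in self._overrides
        if is_new and len(self._overrides) >= self.MAX_ENTRIES:
            raise OverflowError("customer dictionary is full")
        self._overrides[source] = target

    def delete(self, source):
        """Remove a user entry; the main dictionary entry (if any) shows again."""
        self._overrides.pop(source, None)

    def lookup(self, source):
        """User entries win; otherwise fall back to the main dictionary."""
        if source in self._overrides:
            return self._overrides[source]
        return self._main.get(source)


# Illustrative entries only (transliterated Russian, invented for the example).
main = {"samolet": "aircraft"}
user = CustomerDictionary(main)
user.add("samolet", "airplane")   # replacement meaning
user.add("planer", "airframe")    # not-found word
```

Deleting a user entry restores the main-dictionary meaning, which is the practical consequence of never merging the two.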

The customer-specific dictionary features a simplified and scaled-down
version of dictionary coding and has a user-friendly menu. Modification is limited
to adding new words and contiguous-word expressions, deletion of words, and
modification of English meanings of existing words and expressions. Complex
semantics, the ability to set bits and bytes on, scanning, and if/or statements are
not allowed. The menu asks for rudimentary information about the source
word(s), including gender, part of speech, declension class, and animation. The
menu also queries information on number and declension in the target language.
The goal is rapid dictionary development and control by the user without
burdening him with the complexities of linguistics.

The customer-specific dictionary offers several distinct advantages to the
end user. First, it permits the user to control certain aspects of the MT
process, allowing him to become actively involved. There is much data confirming
that the reception of MT by translators or analysts is increased when the user
feels that he has some control over the translation process, that his
corrections can improve the system, and that he is not totally at the mercy of an
impersonal black box. Moreover, the customer-specific dictionary is an efficient
way of resolving multimeaning or translation preference disputes among a wide
audience of users. Finally, if any particular user has classified terminology
that cannot be entered into the general Systran dictionaries, he can retain
these terms in his user-controlled customer-specific dictionary.

We are also developing a post-processor for the finished machine-translated
English file, beginning with English translation produced by the Russian system.
The idea is to improve the readability of the final English file by automatically
manipulating it to remove instances of awkward or ungrammatical usage that
make the translation difficult to read. In a limited sense it would be a
"translation of the translation," but would specifically address certain classes
of errors produced by machine translation. It would incorporate some of the
features of what are called grammar checkers; however, the errors would be
corrected automatically. Based on empirical generalizations made from a large
corpus of raw machine output, a limited rule-based parser would be developed for
English and incorporated in a post-processor.

Initial considerations would be the use of articles in English, treatment
of noncopular sentences, animate/inanimate pronoun resolution, and rules to
resolve certain slashed entries (vse = all/entire). An example of the latter is
that in English one can say "all the boys" but one cannot say "entire the boys,"
or one can say "the entire group" but not "all the group." Although these
issues are dealt with by the current MT software, the coding and linguistic
software has become so complex that it seems easier to deal with certain classes of
readability problems from a new perspective - the machine-translated English
file. The goal is to write a limited English parser that will produce results
without investing a great deal of money in the development. We believe this
goal is attainable.
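
As an illustration of the kind of post-processing rule meant here, the following sketch resolves the slashed entry "all/entire" from the position of the article alone. It is an assumption-laden toy, not the planned post-processor: the function name and the regular expression are invented, and only the two contrasts quoted above are covered.

```python
import re

# Optional "the" before the slashed entry, optional "the" after it.
SLASHED = re.compile(r"\b(the\s+)?all/entire\b(\s+the\b)?", re.IGNORECASE)


def resolve_all_entire(text):
    """Pick "all" before an article and "entire" after one; otherwise leave as is."""

    def choose(match):
        before, after = match.group(1), match.group(2)
        if after and not before:
            return "all" + after        # "all the boys"
        if before and not after:
            return before + "entire"    # "the entire group"
        return match.group(0)           # undecidable by this rule: keep the slash

    return SLASHED.sub(choose, text)
```

Entries that the rule cannot decide are left untouched, so that a posteditor still sees the alternatives.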


Conclusions 

Machine translation has undergone a long evolutionary development at FTD
extending over more than 20 years. The Russian system, e.g., has consistently
provided fast-turnaround, economical, and usable translations that have met user
requirements. Specialty dictionaries that provide consistency of technical
translations have been continuously developed and updated over the years. New
language pairs have been added. The recent development of offering machine
translation directly to the user-analyst has dramatically broadened the scope
and acceptance of machine translation. The use of rapid raw-machine translation
for obtaining the essential information content of short texts is becoming more
and more important. This seems to be the solution to dealing with the
information explosion we are now witnessing. Finally, giving the end user local
control over certain types of dictionary development on his own PC will, we
believe, foster greater interest and use of the interactive system for rapid
information acquisition.
In Europe, research and development in the field of machine translation has been boosted by
the EEC's EUROTRA program. However, operative systems are usually based on older
technologies, as in SYSTRAN or LOGOS. In recent years, the METAL system designed by
Siemens has proved its applicability. As it exemplifies the state of the art it is described
in detail. METAL is a modular system with recursive grammars and non-sequential processing.
It contains hierarchically structured lexicon modules to facilitate subject-specific
translation. The end-user is provided with powerful tools to update his own lexicon. METAL
is integrated into a chain of automated processes from the acquisition of the source text
to the production of a camera-ready version of the target text. User experiences show a
marked  productivity  gain  and  a  reduction  of  turn-around  time. 


In 1966, the ALPAC report ended American government funding for the development of
operative  machine  translation  systems,  citing  the  various  project  failures  and  pointing  out 
that  all  the  millions  of  dollars  of  support  had  not  been  able  to  establish  a  single 
operative  system.  Unfortunately,  the  positive  suggestion  to  invest  more  money  into 
theoretical  basic  research  for  machine  translation  was  overlooked  and  most  American 
projects  were  cancelled.  European  researchers  were  a  bit  less  affected  by  the  ALPAC  report 
since  they  had  not  received  large  amounts  of  government  funding  anyway  and  were  not  under 
the  pressure  to  produce  large  operative  systems.  Research  in  Europe  centered  mainly  in 
Grenoble  where  under  the  direction  of  Bernard  Vauquois  GETA  was  established,  and  at  the 
University of Saarbrücken which eventually received funding from the German government for
the development of the SUSY system. As with most university projects, it was not to be
expected that commercially viable and robust systems would be designed. Lack of long-term
financial support and personnel turnover were among the reasons; the lack of adequate
hardware for such applications and an insufficient linguistic basis were others.

In  retrospect  it  is  somewhat  strange  that  multilingual  Europe  had  not  been  more  active  in 
the  field  of  machine  translation.  The  first  commercially  available  system  was  Systran, 
designed  in  the  USA,  followed  by  Logos,  also  designed  in  the  USA.  Both  systems  are 
available  in  Europe.  Systran  is  offered  in  France  via  Minitel  through  Gachot  S.A.,  and  the 
Logos  system  is  marketed  as  a  software  package  on  IBM  mainframes.  Other  companies  such  as 
Weidner  (also  US  based)  did  not  survive  the  extremely  high  investment  necessary  to  come  up 
with  a  marketable  and  viable  product. 

Since  then,  the  further  integration  of  the  European  Community  has  sharply  increased  the 
need  for  operative  machine  translation.  Concurrently,  the  field  of  Computational 
Linguistics  has  finally  established  itself  at  various  universities,  from  Leuven  to 
Manchester, Bergen to Nancy and Stuttgart, to name just a few. In other words, the base for
linguistic work towards the elusive goal of high quality machine translation has been
broadened considerably. The EUROTRA project sponsored by the European Community may not
result in an operative system in the near future. It nevertheless has done a lot to
promote research in the field of Computational Linguistics and machine translation in
particular. Certainly, the European public has become more aware of the problems imposed
by multilinguality. As all national languages of the member states are considered
equal, a vast amount of documents must be translated to and from several languages.

Already the European Parliament is spending more than half of its budget on translation.

Outside  of  public  administration,  industry  is  equally  affected. 
Costs for research and development are spiralling. At the same time, the life expectancy of
a newly developed technology is decreasing. At the beginning of the century, a new
technology could be expected to last for about five decades before being superseded by the
next generation. By now a new technology may be obsolete in less than five years, and the
innovation cycles are getting shorter still. In addition, our technology is changing, away
from self-explanatory implements towards more and more complex products. The concrete and
tangible  objects  of  the  past  did  not  require  extensive  documentation  since  information 
about  function  and  operation  of  the  device  could  safely  be  assumed  to  be  within  the  "world 
knowledge"  of  the  user.  However,  with  the  advent  of  miniaturization  of  devices  and  a 
gradual  shift  towards  abstract  implementations  of  problem-solving  tools  and  procedures  like 
software,  the  user  is  no  longer  in  a  position  to  comprehend  the  workings  of  such  a 
sophisticated  system  without  explicit  and  detailed  documentation. 

This combination of ever shorter innovation cycles and an increasing amount of
documentation  per  product  leads  to  a  veritable  explosion  of  the  volume  of  documentation  in 
the  industrial  sector.  The  tremendous  costs  for  research  and  development  can  only  be 
recovered  if  larger  markets  are  found,  i.e.  if  export  of  a  product  can  augment  sales  in 
the  home  market.  This  however  necessitates  the  translation  of  the  relevant  documentation. 
Even within Europe there is no lingua franca which would be understood by all. Experience
shows that among Europeans the presumed competence in a foreign language is usually
overestimated. There are very few engineers who are able to understand a complex
foreign-language description of a complex system with the degree of precision required for
the error-free operation or even further development of such a system. The same holds true
for the exchange of scientific research results. Unavailability of such results on account
of  language  barriers  can  lead  to  the  unnecessary  duplication  of  effort  or  to  costly  errors. 

Contrary  to  public  belief,  there  is  a  noticeable  shortage  of  technical  translators  which 
causes  great  concern  in  the  industrial  sector.  To  give  an  example:  the  complete 
documentation for a public switching system may amount to more than 100 000 pages. As on
the  average  technical  translators  produce  about  1000  pages  per  year,  the  task  of 
translating  this  single  set  of  documentation  requires  about  100  man  years.  Any  company 
would  be  hard  pressed  to  find  sufficient  qualified  personnel,  and  even  if  twenty 
specialists  could  be  found  there  would  be  a  delay  of  five  years  between  delivery  of  the 
physical  product  and  its  operation.  Such  delays  can  easily  lead  to  the  loss  of  markets. 
Therefore,  besides  the  European  Communities  it  is  mainly  in  large  businesses  that  the  topic 
of machine translation has been addressed. Philips in the Netherlands is developing a
prototype named Rosetta using isomorphic grammars, and the Dutch software firm BSO is
working on a system named DLT which attempts translation via an interlingua based on
Esperanto. Within Germany, the most headway has been made with the METAL system. As it
exemplifies  the  state  of  the  art  it  will  be  described  in  detail. 
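
The staffing arithmetic in the switching-system example above can be checked with a short calculation. The function and parameter names are illustrative; the figures (100 000 pages, 1000 pages per translator per year, twenty translators) are those given in the text.

```python
def translation_effort(pages, pages_per_translator_year, translators):
    """Return (total man-years, elapsed years with the given team size)."""
    man_years = pages / pages_per_translator_year
    elapsed_years = man_years / translators
    return man_years, elapsed_years


man_years, elapsed_years = translation_effort(100_000, 1_000, 20)
print(man_years, elapsed_years)  # 100.0 5.0
```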

Siemens  became  involved  in  the  area  of  machine  translation  in  the  late  seventies. 

Experiments  with  commercially  available  systems  proved  less  than  successful  so  a  decision 
was  made  to  start  a  research  and  development  project  with  the  goal  of  building  an  operative 
machine  translation  system  to  increase  the  productivity  of  the  in-house  translators  and 
reduce  turn-around  time.  In  1978  Siemens  entered  into  a  cooperative  agreement  with  the 
University  of  Texas  at  Austin.  The  Linguistics  Research  Center  at  UT  was  in  the  fortunate 
position  of  having  been  able  to  devote  many  years  of  research  to  contrastive  and 
computational  linguistics,  without  being  forced  to  satisfy  investors  by  marketing  systems 
prematurely.  The  Center's  work  was  conducted  under  the  title  of  "METAL",  and  even  though 
the present system bears no resemblance to the early versions the name has been retained. A
first prototype was tested in 1979. The large program written in FORTRAN was loaded into
the largest mainframe available at the university; all other users had to leave the system.
In the experiment, one short sentence was to be translated from German to English, and only
the pertinent lexicon was loaded. Still the system labored and labored until finally a
translation appeared, after more than three hours! On one hand, the experiment proved
that  the  linguistic  approach  in  METAL  might  work,  on  the  other  hand  it  showed  quite  clearly 
that  an  operative  machine  translation  system  needed  to  be  designed  and  implemented  in  a 
different  manner. 
Hardware:

By  now,  the  linguistic  component  of  METAL  is  written  in  CommonLisp,  the  other  functions 
such  as  the  text  processing  component  are  written  in  C.  The  system  is  implemented  on  a 
hardware  package  consisting  of  several  translator  workstations  and  a  dedicated  LISP  machine 
running as a server in the background. The hardware configuration looks as follows:

Symbolics LISP machines are small enough for an office environment but very powerful. The
translation  throughput  with  METAL  is  about  200  pages  per  day.  That  is  far  more  than  a 
single  translator  could  ever  postedit.  As  the  LISP  machine  is  a  single-user  system  it  is 
linked  via  Ethernet  to  a  multi-user  translator  workstation  running  under  SINIX.  From  these 
terminals,  translation  jobs  are  started  and  all  the  tasks  of  deformatting  and  reformatting 
and  postediting  are  handled.  The  translation  process  running  in  batch  in  the  background 
is  detached  from  other  processing  steps  and  does  not  interfere  with  any  of  the  tasks  at  the 
translator's terminal. The SINIX system also provides the interface to other office
systems, e.g. the Siemens or IBM office environment.* For reasons of lexicon integrity and
uniformity  of  terminology,  the  functions  of  lexicon  modification  and  structuring  reside 
centrally  on  the  LISP  machine.  This  physically  supports  an  organization  where  lexicon 
maintenance  is  performed  centrally  for  an  installation  and  ensures  that  responsibility  for 
the  lexicon  remains  with  the  terminologist  in  charge,  without  -  possibly  anonymous  - 
interference. 


* SINIX is the UNIX from Siemens.
  UNIX is a registered trademark of AT&T.
System Structure

From  the  outset,  METAL  was  built  in  a  highly  modular  way  so  as  to  permit  the  inclusion  of 
new  elements  or  the  modification  of  existing  elements  without  major  ill  effect  on  the  other 
components. There is a language-independent core system to which language-specific modules
for analysis, transfer and synthesis are added. The analysis module of a given language is
designed in such a way that it can be used as the basis for transfer to various target
languages without any modification. This decreases development time and expense for new
language pairs. Furthermore, the "open" system structure also makes METAL an adequate
basis for future applications in semantic content analysis, information retrieval or as a
natural-language front-end for expert systems or data bases. Its first application,
however,  is  machine  translation. 

Grammar 

As there is at present no linguistic theory available that would describe even a single
language unambiguously and completely, a somewhat eclectic approach has to be chosen in the
grammar. METAL employs a transfer system rather than an interlingua. It seemed that to
define a meta-language incorporating all possible features of many languages would not only
be an endless task but rather fruitless as well. Such a system would soon become
unmanageable and perhaps collapse under its own weight. If on the other hand the
intermediate meta-language were reduced to a manageable level of abstraction then too much
surface information necessary for a faithful translation would be lost. Abstract formulae
describing a text may be adequate for a rough paraphrase but not for translation with the
aim  of  publishing  the  target  document.  Tests  with  several  European  languages  have  shown 
that  at  least  between  these  related  languages  a  transfer  system  is  adequate.  METAL  uses 
basically  phrase-structure  rules  which  are  augmented  by  tests  on  the  constituents,  their 
interaction  and  various  other  constraints.  In  contrast  to  other  systems,  the  rules  are 
recursively  applied  so  that  their  number  can  be  kept  low.  To  illustrate  the  advantages  of  a 
recursive  system  let  us  take  the  following  (simplified)  sample  rules: 

rule 1: S -> NP VP
     2: NP -> DET ADJ N
     3: ADJ -> ADJ ADJ
     4: ADJ -> ADV ADJ

Rule  1  says  that  a  sentence  may  consist  of  a  noun  phrase  (NP)  and  a  verb  phrase  (VP),  rule 
2  that  a  noun  phrase  may  consist  of  a  determiner  (DET),  an  adjective  (ADJ)  and  a  noun  (N). 
Rules  3  and  4  on  the  other  hand  state  that  an  adjective  may  consist  of  two  adjectives,  or 
of an adverb (ADV) and an adjective respectively (of course, all constraints and tests have
been  left  out  in  our  sample  rules). 

Now  take  the  following  sentences: 

a. The old car runs.

Two  rules,  1  and  2,  would  be  necessary  to  interpret  the  surface  structure  as  a  sentence. 

b.  The  very  old  car  runs. 

Here, rules 1, 2 and 4 would lead to a sentence analysis.

c. The rusty old car runs.

Rules 1, 2 and 3 interpret the structure to be a sentence. According to rule 3, the two
adjectives "rusty" and "old" are interpreted as one adjective for analysis in rule 2. If
we continue to apply rules 3 and 4 to a given surface structure we can reach an
interpretation  of  very  complex  structures,  even  of  something  admittedly  contrived  like: 

d.  The  very  rusty  shabby,  slightly  dented  comfortable  old  car  runs. 
Imagine having to construct rules like NP -> DET ADV ADJ ADJ ADV ADJ ADJ ADJ N to analyze a
trivial sentence like this... A conventional machine translation system usually tries to
account for every possible surface structure with a separate rule. This approach assumes
(falsely) that a natural language is a finite system and that a sufficiently large set of
individual syntax rules would eventually cover all cases. Aside from the fact that for
free word order languages this is intrinsically impossible, managing tens of thousands of
individual rules is very difficult. METAL at present uses no more than 600 grammar rules
but  is  nevertheless  able  to  deal  with  sentence  structures  it  has  never  encountered  before. 
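
To make the point about recursion concrete, the four sample rules can be run through a small recognizer. This is only a sketch and not the METAL parser: the lexicon is invented for the example sentences, "runs" is simply tagged VP because the sample rules do not expand the verb phrase, and all constraints and tests are left out, as in the rules themselves.

```python
from functools import lru_cache

# The four simplified sample rules from the text.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("DET", "ADJ", "N")],
    "ADJ": [("ADJ", "ADJ"), ("ADV", "ADJ")],
}

# Hypothetical lexicon covering only the example sentences.
LEXICON = {
    "the": "DET", "very": "ADV", "slightly": "ADV",
    "old": "ADJ", "rusty": "ADJ", "shabby": "ADJ",
    "dented": "ADJ", "comfortable": "ADJ",
    "car": "N", "runs": "VP",
}


def is_sentence(text):
    """Return True if the rules derive S over the whole word sequence."""
    tokens = [w.strip(".,").lower() for w in text.split()]
    tokens = [t for t in tokens if t]

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # A single word matches its lexical category directly.
        if j - i == 1 and LEXICON.get(tokens[i]) == symbol:
            return True
        return any(match_seq(rhs, i, j) for rhs in RULES.get(symbol, ()))

    def match_seq(rhs, i, j):
        # Split tokens[i:j] into non-empty parts, one per right-hand-side symbol.
        if len(rhs) == 1:
            return derives(rhs[0], i, j)
        last_split = j - (len(rhs) - 1)
        return any(
            derives(rhs[0], i, k) and match_seq(rhs[1:], k, j)
            for k in range(i + 1, last_split + 1)
        )

    return derives("S", 0, len(tokens))
```

With these four rules alone, sentences a to d are all accepted, whereas a flat rule set would need one noun-phrase rule per adjective pattern. Note that "The car runs." is rejected, because sample rule 2 requires an adjective.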

On  account  of  its  recursive  structure,  the  grammar  does  not  need  to  state  explicitly  that  a 
certain  sequence  of  constituents  is  grammatically  legal  and  may  be  interpreted  as  a  unit. 

The  grammar  rules  in  METAL  will  generate  legal  structures  from  their  base  components.  In 
other  words,  the  METAL  grammar  is  an  "open"  system  whose  coverage  extends  far  beyond  the 
explicitly  stated  rule  content. 

The  grammar  rules  are  indexed  to  make  processing  more  efficient  and  also  to  allow  the 
partial  use  of  the  grammar  rules  for  e.g.  "quick  and  dirty"  translation  for  purposes  of 
information  gathering.  The  most  commonly  applied  rules,  e.g.  those  for  word  level 
morphology  and  for  frequently  occurring  basic  structures,  are  defined  as  the  most  basic 
level.  Higher  level  rules  deal  with  more  complex  or  even  ungrammatical  structures.  If  a 
given  surface  structure  can  be  analyzed  using  lower-level  rules  then  the  more  complex  and 
less  likely  rules  are  disregarded,  which  saves  processing  time.  If  no  interpretation  is 
possible  with  the  lower  level  rules  then  incrementally  higher  levels  of  rules  are  added  to 
the  lower  level  rules,  and  again  an  interpretation  is  attempted.  If  for  the  purpose  of  a 
rough  translation  only  the  lower  three  levels  of  rules  are  invoked  the  translation  result 
will  not  be  as  good  but  perhaps  still  quite  adequate  for  some  applications,  and  processing 
will  be  faster. 
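
The level mechanism described above amounts to a retry loop over growing rule sets. The sketch below assumes a parser is available as a function `parse(tags, rules)` that returns an analysis or `None`; that interface, the toy parser and the rule table are hypothetical and serve only to show the control flow.

```python
def analyze_by_level(tags, leveled_rules, parse, max_level=None):
    """Try the lowest rule level first, adding higher levels only on failure.

    leveled_rules is a list of (level, rule) pairs. Returns (level, analysis)
    for the first level at which parsing succeeds, or None.
    """
    levels = sorted({level for level, _ in leveled_rules})
    if max_level is not None:
        # "Quick and dirty" mode: never go beyond the given level.
        levels = [lv for lv in levels if lv <= max_level]
    for top in levels:
        active = [rule for level, rule in leveled_rules if level <= top]
        analysis = parse(tags, active)
        if analysis is not None:
            return top, analysis
    return None


def toy_parse(tags, rules):
    """Stand-in parser: succeeds if a rule's pattern equals the tag sequence."""
    for name, pattern in rules:
        if tuple(tags) == pattern:
            return name
    return None


LEVELED_RULES = [
    (1, ("S-basic", ("DET", "N", "V"))),
    (2, ("S-adj", ("DET", "ADJ", "N", "V"))),
    (4, ("S-fragment", ("N", "N"))),
]
```

A frequent structure is found at level 1 without the higher rules ever being consulted, while the fragment is only analyzed once level 4 is admitted.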

A  second  principle  distinguishes  METAL  from  older  systems:  linguistic  parallel  processing. 

Of  course,  it  is  impossible  to  translate  at  the  word  level.  Not  only  may  single  words 
denote  different  concepts,  as  e.g.  a  "ball"  may  refer  to  a  formal  dance  or  a  round  object 
used  in  sports.  Much  more  problematic  is  the  fact  that  a  word  may  reflect  one  of  several 
different  word  classes,  each  with  a  different  syntactic  function  in  the  sentence.  The 
English word "back" for example can be:

noun:  His  back  hurts 

adjective:  The  back  issue  of  Punch... 

adverb: Meanwhile, back at the ranch...

verb  particle:  The  boss  paid  him  back 

verb  stem:  His  colleagues  back  him  up. 

A  decision  about  the  function  of  "back"  within  the  sentence  cannot  be  made  at  the  word 
level. However, even at the phrase level there would be problems. "Eating ice cream" may
be  considered  a  contiguous  phrase,  as  in: 

Eating  ice  cream  can  be  pleasurable. 

In  a  different  context,  however,  the  same  surface  string  would  not  constitute 
a  syntactic  unit,  as  in: 

Children  eating  ice  cream  can  make  a  mess. 

For a correct interpretation it is indispensable to analyze the complete sentence. This is
essential when dealing with free word order languages such as German, where one
element of the verb may occur in a position quite distant from the other (separable
prefixes, for example). In METAL, all possible interpretations of all elements in a
sentence are written in a chart. The parser builds structures, utilizing the grammar rules
and information contained in the lexicon. These structures are weighted based on
probabilities and compared. Only when an interpretation spanning the whole sentence and
accounting plausibly for all elements is reached, is the transfer to the target language
attempted. In other words, no decision about the function of a sentence element is made
until all other elements have been considered as well. This is computationally expensive
but seems to be the only way to treat a natural language with all its ambiguities. If no
interpretation  spanning  the  whole  sentence  can  be  found  the  system  invokes  a  fail-soft 
mechanism  and  delivers  a  translation  of  the  individual  phrases  it  had  been  able  to 
interpret.  In  some  language  combinations  the  output  may  still  be  grammatically  correct. 
In  other  cases,  the  posteditor  has  to  correct  the  output.  At  the  end  of  the  analysis 
phase,  the  sentence  is  depicted  as  a  tree  structure.  Behind  each  of  the  nodes  is  an 
extensive  set  of  grammatical  and  lexical  information: 
In the transfer phase, this tree structure is transformed into a normalized tree structure
appropriate  to  the  target  language: 


Lexicon

No  machine  translation  system  can  operate  without  an  adequate  lexicon.  But  the 
overall  number  of  entries  in  a  system  dictionary  is  not  a  relevant  criterion  for  a 
qualitative  assessment  or  for  a  legitimate  comparison  of  different  systems.  For  one  thing, 
the internal structure of an entry may differ. Perhaps all stems or even all tokens of a
word are listed separately in the dictionary, or by contrast all forms may be subsumed
under  a  single  entry,  with  internal  pointers  to  tables  and  rules  so  that  full  forms  can  be 
generated.  METAL  employs  the  latter  structure. 

Secondly, it makes a difference whether a system relies on one unidirectional dictionary,
with  a  direct  link  between  one  source  language  word  and  one  target  language  word,  or 
whether  multiple  dictionaries  are  used.  METAL  operates  on  both  monolingual  lexicons  and  a 
transfer  lexicon.  The  monolingual  lexicons  contain  morphological,  syntactic  and  semantic 
information  needed  for  the  analysis  and/or  generation  of  a  language.  The  transfer  lexicon 
provides  a  link  from  the  source  language  to  the  target  language,  indicating  under  which 
conditions, in which contextual environment and in which subject field a source language
entry should point to a specific target language entry. As an example, the German verb
"zerlegen" would be translated into English as "analyze" if the direct object has the
canonical  form  "Satz"  (sentence).  It  would  be  transferred  as  "dissect"  if  the  direct  object 
has  the  semantic  type  ’human’  or  ’animate’  but  it  would  be  translated  as  "disassemble"  if 
the  direct  object  is  concrete. 
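
The "zerlegen" example can be written down as a small table of conditional transfers. The data structure, the names and the order in which the conditions are tried are assumptions made for this sketch; the text gives only the three conditions and their English targets.

```python
from dataclasses import dataclass, field


@dataclass
class DirectObject:
    """Hypothetical summary of what the analysis knows about the direct object."""
    canonical_form: str
    semantic_types: frozenset = field(default_factory=frozenset)


# (condition, English transfer) pairs, tried in order. The ordering is an
# assumption: the most specific test (a particular canonical form) comes first.
ZERLEGEN_TRANSFERS = [
    (lambda obj: obj.canonical_form == "Satz", "analyze"),
    (lambda obj: bool(obj.semantic_types & {"human", "animate"}), "dissect"),
    (lambda obj: "concrete" in obj.semantic_types, "disassemble"),
]


def transfer_zerlegen(obj):
    """Return the English verb for "zerlegen" given its direct object, or None."""
    for condition, english in ZERLEGEN_TRANSFERS:
        if condition(obj):
            return english
    return None
```

For instance, a direct object "Motor" typed as concrete yields "disassemble", while "Satz" yields "analyze" regardless of its semantic type.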
The advantages of such a lexicon structure as used in METAL are obvious. The extensive
grammatical  information  contained  in  the  monolingual  dictionaries  needs  to  be  carried  only 
once,  even  if  many  different  entries  in  one  of  the  languages  correspond  to  the  same  entry 
in  the  other  language.  The  transfers  of  the  English  verb  "take"  for  example  may  fill 
several  pages  of  a  book.  If  each  transfer  entry  were  to  contain  all  the  morphological  and 
syntactic  information  for  "take"  as  well,  the  system  dictionary  would  be  inflated 
excessively.  Not  only  would  this  waste  storage  space  but  it  would  also  require  superfluous 
coding  efforts.  Moreover,  if  monolingual  and  transfer  lexicons  are  kept  separate,  the 
monolingual  entries  can  be  used  in  other  language  combinations  without  modifications. 

Another  aspect  of  a  lexicon  to  be  considered  is  the  organization  of  its  terminological 
content.  In  most  European  languages,  the  set  of  the  most  frequent  5000  words  makes  up 
approximately  90  %  of  any  given  text  (on  the  average).  Beyond  this  limited  set,  the  point 
of  diminishing  returns  is  soon  reached.  Increasing  an  undifferentiated  general  lexicon  to 
more than 100 000 words, for example, would not increase text coverage significantly. On
the  contrary,  many  unpleasant  ambiguities  would  be  introduced  which  can  be  avoided  in  a 
modular  structure. 

The  METAL  lexicon  is  organized  as  follows:  There  are  modules  for  function  words  (FW)  like 
prepositions,  determiners  and  conjunctions,  for  general  vocabulary  (GV)  and  for  common 
technical  vocabulary  (CTV)  organized  in  a  tiered  hierarchy.  From  the  next  level  down,  each 
end-user  can  define  and  structure  his  own  modules  and  tailor  them  to  his  specific 
application.  For  in-house  applications  in  Siemens,  there  are  for  example  modules  like  Data 
Processing  (DP)  with  submodules  Software  (SW),  Hardware  (HW)  etc.  Furthermore,  it  is 
possible  to  define  transfers  on  the  basis  of  a  specific  customer,  a  specific  product  or  a 
specific target country. Thus a text translated into British English will show "lorry",
whereas one for the USA will show "truck", and a text intended for Spain will automatically
have "ordenador" instead of the Colombian "computadora". The METAL lexicon structure can be
visualized like this (simplified):

Before a translation run is started, the modules appropriate to the subject area of the
text are defined. If the syntactic and semantic criteria for the selection of a lexicon
entry are met and there are several candidates for transfer, then the one tagged for the
subject area of the text or tagged for a hierarchically closer module is chosen. This
ensures that highest priority is assigned to subject-specific transfers.
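
The selection rule just described can be sketched as follows. This is only an illustration under stated assumptions: the parent links between modules (SW and HW under DP, DP under CTV, CTV under GV) are one reading of the tiered hierarchy, and the candidate translations are invented.

```python
# Hypothetical module tree: child -> parent. Abbreviations follow the text;
# the exact tiering is an assumption made for this sketch.
PARENT = {"SW": "DP", "HW": "DP", "DP": "CTV", "CTV": "GV", "GV": None}

def ancestors(module):
    """Return the module followed by its ancestors, closest first."""
    chain = []
    while module is not None:
        chain.append(module)
        module = PARENT[module]
    return chain

def choose_transfer(candidates, text_module):
    """Pick the candidate tagged for the module closest to text_module.

    candidates: (target_word, module_tag) pairs that have already passed
    the syntactic and semantic selection criteria.
    """
    chain = ancestors(text_module)
    eligible = [c for c in candidates if c[1] in chain]
    if not eligible:
        return None
    return min(eligible, key=lambda c: chain.index(c[1]))[0]

# Invented candidates for the German noun "Speicher".
candidates = [("attic", "GV"), ("storage tank", "CTV"), ("memory", "DP")]
print(choose_transfer(candidates, "SW"))   # text declared as software
print(choose_transfer(candidates, "CTV"))  # text declared as general technical
```

Because the chain is ordered from the declared module upwards, a subject-specific transfer always outranks a more general one.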

The  main  source  for  the  required  terminology  for  new  subject  fields  is  TEAM,  the 
multi-lingual  terminology  database  operated  by  Siemens  which  at  present  contains 
approximately  three  million  records  in  up  to  eight  languages.  An  interface  between  TEAM  and 
METAL  facilitates  the  installation  of  new  lexicon  modules. 

External  users  update  their  own  lexicons  with  the  aid  of  the  so-called  INTERCODER,  an 
integrated  expert  system.  It  guesses  at  the  morphological  and  syntactic  behavior  of  new 
lexicon  entries  and  proposes  the  necessary  coding;  the  missing  pieces  of  information  are 
inferred  from  a  set  of  rules  and  partial  information  already  contained  in  the  lexicon.  The 
INTERCODER  has  proven  its  usefulness  in  reducing  coding  time  by  a  factor  of  ten.  While  it 
is not recommended to alter function word entries (they are too closely linked to the
grammar), a translator may code all other word classes including verbs. Even though the
grammar  rules  are  not  accessible  to  an  end-user,  the  transfer  lexicon  permits  significant 
syntactic  transformations.  On  top  of  being  able  to  specify  transfers  on  the  basis  of  the 
instantiation  of  frames,  the  presence  of  arguments  of  a  certain  semantic  type  or  a  specific 
canonical  form,  the  user  can  influence  the  target  structure  considerably.  Source  language 
active  structures  may  be  turned  into  impersonal  constructions,  roles  of  arguments  can  be 
changed,  complements  can  be  converted,  elements  can  be  added  or  deleted  etc.  Great  care  has 
been  taken  in  the  design  of  the  user  interface  so  as  not  to  overburden  a  translator  with 
linguistic  detail. 
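
How such a coding proposal might be inferred can be sketched in a few lines. This is not the INTERCODER's actual rule set, which the text does not describe; the lexicon entry, the suffix rules and the field names below are all invented for illustration, using German nouns as the example.

```python
# Invented miniature lexicon and suffix rules; not METAL's real coding scheme.
LEXICON = {"speicher": {"gender": "m", "plural": "-"}}
SUFFIX_RULES = [
    ("ung", {"gender": "f", "plural": "en"}),
    ("heit", {"gender": "f", "plural": "en"}),
    ("chen", {"gender": "n", "plural": "-"}),
]

def propose_coding(noun):
    """Propose a coding for a new German noun and record where it came from."""
    word = noun.lower()
    # 1. Partial information from the lexicon: a compound inherits from its head.
    for head, coding in LEXICON.items():
        if word != head and word.endswith(head):
            return dict(coding, source="head:" + head)
    # 2. Rules: some derivational suffixes determine gender and plural class.
    for suffix, coding in SUFFIX_RULES:
        if word.endswith(suffix):
            return dict(coding, source="suffix:" + suffix)
    # 3. No guess possible: the translator has to supply the coding.
    return {"gender": None, "plural": None, "source": "manual"}

print(propose_coding("Arbeitsspeicher"))
print(propose_coding("Sicherung"))
print(propose_coding("Tisch"))
```

The proposal is only a default; in the workflow described above the translator confirms or corrects it.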

Office  Environment 

A production system needs to do more than simply translate individual sentences
entered from the keyboard. Most texts which have to be translated quickly and in
great volume, such as technical documentation, are heavily formatted. In some texts
more  than  half  of  the  characters  on  a  page  may  be  non-translatable  material,  notably  flow 
charts,  diagrams,  tables  and  various  control  characters  for  format  and  layout.  It  would  be 
highly  uneconomical  to  manually  extract  the  text  portions  to  be  translated  and  afterwards 
manually  re-input  them.  That  would  not  only  be  expensive  but  it  would  also  invite  errors 
in  the  additional  reformatting  tasks.  Therefore,  METAL  has  been  integrated  into  a  chain  of 
processes, from text acquisition via automatic deformatting and translation to automatic
reformatting  procedures.  A  translation  run  usually  goes  through  the  following  steps: 

- Text acquisition ([...] reader)
- Separation of language and format data
- Processing of special formats (diagrams, tables)
- Determination of words to be added to system dictionary
- The INTERCODER, an interactive expert system
- Translation
- Merging of language and format data
- Revision of translation
- Word processing system, printer output, typesetting


A text is usually received in machine-readable form, by file transfer, floppy [...] system check
the pages for tables, graphs, etc. and mark them. They identify the text portions to be
translated  and  generate  a  mask  of  the  page.  The  individual  translation  units,  usually 
sentences  but  in  the  case  of  headlines  or  table  entries  also  single  words  or  phrases,  are 
automatically  recognized,  numbered  consecutively  and  extracted  from  the  page  mask.  They  are 
written into a text file and transferred to the LISP machine for translation. After
translation, the file containing the target language text units is returned to the SINIX
system for post-editing. Here, the translators can choose whether they want to postedit an
interlinear version which groups single source language/target language units sentence by
sentence, or work on two windows with source and target text, or whether they prefer a
target  language  output  that  has  already  been  reformatted.  In  the  former  cases,  the 
posteditors  would  start  the  reformatting  program  after  having  made  their  corrections.  At 
the  end,  the  target  language  text  is  available  with  all  the  formatting  information  and  with 
the  same  layout  as  the  original. 
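
The deformatting and reformatting steps can be illustrated with a minimal sketch. It assumes that format codes are written as `<...>` tags, which is a simplification chosen here and not the format actually handled on the SINIX system; the function names are likewise hypothetical.

```python
import re

# Assumption: format codes look like <...>; real word processor codes differ.
FORMAT_CODE = re.compile(r"(<[^>]*>)")

def deformat(page):
    """Split a page into a mask and a consecutively numbered list of units."""
    mask, units = [], []
    for piece in FORMAT_CODE.split(page):
        if piece.strip() and not FORMAT_CODE.fullmatch(piece):
            # Keep surrounding whitespace in the mask, extract only the text.
            lead = piece[:len(piece) - len(piece.lstrip())]
            trail = piece[len(piece.rstrip()):]
            units.append(piece.strip())
            mask.append("%s{%d}%s" % (lead, len(units) - 1, trail))
        else:
            mask.append(piece)
    return "".join(mask), units

def reformat(mask, translated_units):
    """Merge translated units back into the mask, restoring the layout."""
    return re.sub(r"\{(\d+)\}", lambda m: translated_units[int(m.group(1))], mask)

mask, units = deformat("<h1>Wartung</h1><p>Pumpe abschalten.</p>")
print(units)
print(reformat(mask, ["Maintenance", "Switch off the pump."]))
```

Only the numbered units travel to the translation machine; the mask stays behind and guarantees that the target text ends up in the original layout.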

Before a text is translated, it is advisable to run a comparison of text and system
lexicon. As linguistic processing is based not only on grammar rules but also on
information contained in the lexical entries, sentences in which several words are unknown
to the system are difficult to analyze, and the translation is likely to be inferior.
Therefore missing words should be added to the lexicon. In METAL, the comparison of
lexicon  and  text  produces  several  files.  One  is  a  list  of  unknown  words,  each  listed  with 
its  location  and  context  so  that  transfers  are  more  easily  found.  This  list  will  actually 
also  show  faulty  orthography  so  that  the  program  can  be  used  as  a  spelling  checker  as  well. 
The  second  output  is  a  list  of  compound  words  which  were  not  found  in  the  system  lexicon 
but  for  which  a  translation  is  proposed  on  the  basis  of  the  individual  components.  Here 
the  translator  is  called  on  to  make  sure  that  the  proposed  translations  are  appropriate  to 
the  subject  area.  The  third  output  is  a  text-based  glossary,  listing  source  term  and 
proposed  translation.  This  may  be  used  to  review  subject  area  adequacy  of  the  lexical 
entries. It is also useful if, in a large document, one portion is to be translated by the
machine  translation  system  and  the  initial  pages  are  written  in  a  style  which  makes  them 
unsuitable  for  machine  translation.  In  such  a  case,  the  human  translators  can  be  given  a 
glossary  of  exactly  the  terms  contained  in  the  pages  to  be  translated  so  that  they  don’t 
have  to  wade  through  mounds  of  subject-area  listings.  This  will  ensure  that  the  same 
terminology  will  be  used  throughout  the  whole  document. 
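
The three outputs of the lexicon comparison can be sketched as follows. The miniature German-English lexicon, the two-part compound splitter and the function names are assumptions made for this illustration; METAL's own procedure is not described in enough detail to reproduce it.

```python
import re

# Invented miniature German-English lexicon for illustration.
LEXICON = {"der": "the", "öl": "oil", "druck": "pressure",
           "ist": "is", "zu": "too", "niedrig": "low"}

def split_compound(word):
    """Split word into two known lexicon entries, or return None."""
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        if head in LEXICON and tail in LEXICON:
            return head, tail
    return None

def compare(text):
    """Return (unknown words with location and context, compounds, glossary)."""
    unknown, compounds, glossary = [], {}, {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        for token in re.findall(r"\w+", line):
            word = token.lower()
            parts = split_compound(word)
            if word in LEXICON:
                glossary[word] = LEXICON[word]
            elif parts:
                # Proposed translation built from the components; to be reviewed.
                compounds[word] = " ".join(LEXICON[p] for p in parts)
            else:
                unknown.append((word, line_no, line.strip()))
    return unknown, compounds, glossary

unknown, compounds, glossary = compare("Der Öldruck ist zu niedrg.")
print(unknown)
print(compounds)
```

The misspelt "niedrg" lands in the unknown-word list together with its line, which is how the same run doubles as a spelling check.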

Quality  and  User  Experiences 

The  state  of  the  art  in  computational  linguistics  does  not  permit  the  perfect  translation 
of  random  texts.  Therefore,  if  a  text  is  translated  not  simply  for  the  purpose  of  getting  a 
rough  idea  of  the  content  but  with  the  aim  of  publication,  postediting  by  a  human 
translator  will  remain  a  necessity.  Even  if  a  system  is  tuned  for  specific  subject  areas 
there  are  still  sufficient  problems  in  linguistic  analysis,  especially  if  the  meaning  to  be 
conveyed  is  hidden  "between  the  lines".  One  should  not  attempt  to  measure  the 
"correctness"  of  machine  translation  in  percentage  points.  Just  as  with  human  translation, 
there  is  not  necessarily  a  single  solution.  The  quality  of  a  translation  does  not  hinge  on 
the  quality  of  the  translation  system  alone  but  is  equally  dependent  on  the  quality  of  the 
source  text.  Inputting  garbage  will  not  produce  poetry.  One  also  needs  to  consider  the 
intended  purpose  of  the  text,  expectations  of  the  readers  and  even  the  stylistic 
preferences of the post-editor.

The quality of a machine translation system can only be judged by whether
translators working with the system have been able to increase their productivity and
decrease turn-around time. One prerequisite of course is the willingness of translators to
use the system in their daily work, and that presupposes not only a fairly high level of
translation quality but ease of operation as well.

Machine  translation  is  a  recently  evolved  technology  and  is  as  such  vulnerable  in  its 
status.  A  new  technology  can  easily  be  proven  inadequate  or  even  useless  if  the  intended 
recipient refuses to accept such a system or insists on applying it in unsuitable ways.

Therefore  the  introduction  of  a  machine  translation  system  into  an  existing  organization, 
be  it  a  large  industrial  company  or  a  translation  bureau,  requires  several  steps.  First  of 
all,  end-users  must  have  a  clear  picture  of  what  can  be  expected  from  an  MT  system  and  what 
is beyond the scope of today's technology. Inappropriate use will only lead to
frustration. 

Once  the  conditions  for  the  installation  of  a  system  have  been  assessed,  i.e.  translation 
volume,  suitable  types  of  text,  hardware  environment,  and  a  positive  decision  has  been 
reached,  the  organizational  setup  needs  to  be  discussed.  From  which  sources  does  the 
translator receive the original texts? Is there a possibility to influence the style of
the original, to impose certain guidelines in regard to complexity of verbal expressions?
And can the customers be persuaded to use standardized formatting and layout routines so
that the tasks of deformatting and reformatting can be simplified?

Translators using machine translation systems need introductory training. It should
focus on a general introduction to the system's structure and the tools it provides.
Equally important is initial training in the different work techniques that such a system
requires. Provided that the reader of a target document is not concerned with intricacies
of style, the post-editing phase of a machine output can focus on changing this output to
an  acceptable  version  with  the  least  effort.  Certainly,  a  given  version  could  be  rewritten 
in  various  ways,  sometimes  with  a  gain  in  quality  but  sometimes  also  with  simply  an 
idiosyncratic  change  of  style  without  improvement  of  quality. 

Postediting  machine  output  is  different  from  revising  a  "human"  translation.  While  the 
machine  will  make  "severe"  errors  in  syntax,  e.g.  in  prepositional  phrase  attachment,  or 
semantics  in  ambiguous  structures,  a  human  translator  will  make  fewer  but  random  and  less 
predictable errors. Usually, it takes a translator several weeks of practical work with an
MT  system  to  be  able  to  anticipate  the  common  errors  perpetrated  by  the  system  and  look  for 
them.  Experiences  with  more  than  a  dozen  METAL  installations  have  been  quite  positive  and 
can  be  summarized  as  follows: 

Translators  as  well  as  upper  management  have  to  understand  that  a  machine  translation 
system  is  not  a  substitute  for  a  highly  qualified  translator  but  no  more  and  no  less  than  a 
powerful  tool. 

For the use of METAL, an initial training period of one week has been sufficient. A second
week  of  training  after  a  few  months  answers  questions  which  have  arisen  during  the  actual 
productive  application.  After  that,  consultation  on  a  case  by  case  basis  seems  adequate. 

During  the  first  few  months  of  operation,  the  translators’  productivity  will  actually 
decrease.  There  is  the  initial  overhead  of  bringing  the  lexicon  up  to  a  level  where  it 
covers most of the specific texts to be handled. Also, translators have to get used to the
different  work  technique  and  acquire  skills  in  lexicon  building  and  system  administration. 

After  this  initial  learning  phase,  which  may  vary  from  a  few  months  to  more  than  a  year, 
users  have  reported  considerable  gains  in  productivity  and  a  decrease  in  turn-around  time. 

It appears that under favorable conditions a productivity gain by a factor of 2 to 3 is a
realistic goal. In addition to the benefits derived from increased productivity, the
consistency of terminology throughout all documents has been viewed as a qualitative
improvement  of  the  target  text  which  could  not  have  been  achieved  with  "human"  translation. 

METAL  is  now  available  as  a  product.  Development  will  continue  to  integrate  additional 
language  pairs  and  to  streamline  the  interface  to  various  office  environments.  Further 
research  will  focus  on  add-on  semantic  components  and  linking  METAL  to  data  bases,  expert 
systems  and  teaching/learning  systems.  Even  if  the  state  of  the  art  does  not  permit  the 
ideal  solution  in  the  area  of  natural  language  processing  it  seems  that  systems  such  as 
METAL  can  contribute  decisively  to  an  improvement  of  multilingual  communication. 

Summary of the Solutions Offered to Users

Albert YANEZ
Adviser to the Director
Centre de Documentation de l'Armement
26, Bd. Victor,
75996 Paris-Armées,
France


Summary:

The potential user of computer-assisted translation (CAT), that is to say each one of us,
and even more so anyone whose profession is translation, must be able to find out about,
obtain and use the solution best suited to his particular case, so as to reduce the
language barrier he meets at his own level when communicating beyond the ghetto of his
own language.

This summary therefore lists solutions that offer possible recourse, ranging from the
dictionary or terminology data bank to software for lexicographic management, text
analysis or translation assistance, on the understanding that in every case the user will
have to contribute. The results will sometimes be directly usable and sometimes crude, so
that, depending on the case, one can make do with them or must call on further
assistance, that of a human translator or reviser.

While waiting for the perfect, ideal system of tomorrow, which will remain in the
laboratory for a long time yet, there is no alternative to building a bridge, a close
cooperation between system designers or vendors on the one hand and users on the other.
It is from this cooperation that solutions will emerge. They are not on offer: they have
to be built.

That is why, beyond the "products" (tools or systems) present on the market of the
language industry, and of CAT in particular, emphasis is placed on socio-political and
economic aspects which are far from negligible if real solutions are to be reached, that
is, a situation in which the introduction of these tools and of CAT becomes a factor of
productivity, openness and progress. This situation has already been reached by certain
users who took a step to open the way, a way on which others can also set out, alone or
in concert.

oooOooo 


1. Review of possible solutions.


Solution No. 1: Promote one's own language. This solution does not provide the answer,
first because in the long run it leads to the slow but certain degradation of that
language, which is no longer rooted in the cultural identity of a single people and is
used by partners on an unequal footing, which breeds misunderstanding and frustration. It
is nevertheless a way of strengthening the bond between friendly or kindred peoples.

On the English side, one recalls the recent remarks of the Prince of Wales deploring the
drift of the English language and a certain laxity spreading in its use. On the
French-speaking side, the need for a certain vigilance is felt (Comité International de
la Langue Française (CILF), ...).


Solution No. 2: Seek an interlanguage, such as the artificial language advocated by the
International Auxiliary Language Association (linguistic elements common to the Romance
languages and English), or a universal lingua franca such as Dr. Zamenhof's Esperanto.
These are utopian solutions, recalled here nonetheless for the record.


Solution No. 3: Encourage and facilitate the study of languages, make people polyglot.
This is undoubtedly a good thing but insufficient in itself. One believes one knows a
foreign language when one knows only its rudiments, and the problem remains when one is
placed in a professional communication setting in a language that is not one's own.
Nevertheless, many countries, France among them, could usefully go further in language
teaching, particularly for engineers. Many tools exist today to facilitate this teaching
from primary school onwards, for example CD TEL, or computer-assisted teaching based on
the Minitel and the compact disc.


Solution No. 4: Facilitate access to electronic dictionaries, data banks and other
terminology tools.

Most reputable monolingual or multilingual dictionaries in common use have been
digitized and are accessible online or on a microcomputer from a compact disc, for
example Collins-on-line, distributed by Softissimo (France), Robert, Hachette, New Oxford
English Dictionary (NOED), Aérospatiale. The same is true of terminology data banks:

- Eurodicautom, on the Echo host, containing hundreds of thousands of terms and context
phrases and tens of thousands of abbreviations,

- Normaterm, containing 100 000 French and English terms extracted from French and
international standards and from regulatory texts,

- Termdok, on compact disc, giving access to 225 000 terms with definitions, in eight
languages, and bringing together seven terminology data banks,

- Termium, designed first to verify and standardize terminology in the two languages of
Canada, but also as an aid system for translators,

- Termnet, an international network for terminology, which produces and distributes
publications or products and services in the terminology sector on an international
scale,

- TDB (terminology data bank), integrated into an aid system for translators (Carnegie
Mellon University).

The list is probably not exhaustive, but these are tools with which many translators are
already familiar, whether or not they also make use of CAT.

Annex 1 gives the name, content and a point of contact for each of them.


Solution No. 5: Use the software available in the translation and terminology
environment: creation, management and consultation of terminology. I shall mention a few
packages, but here again the list will be far from exhaustive:

- Aquila, usable on a microcomputer in a range of 15 languages, distributed by La Maison
du Dictionnaire (France),

- BBTRO, for the management of lexical data bases, distributed by B'Vital (France),

- Alexis, from GSI-ERLI, which allows navigation between terms and concepts,

- Ink TextTools and Term Tracer, distributed by Ink Languages (France),

- Lexai 2, a workstation for lexicographers, distributed by SEI (France),

- Microcezeau, which among other things allows data banks to be merged with one another
and data to be exchanged with Eurodicautom in many languages, distributed by
Terminformatique (France),

- Termex, for the creation and management of electronic dictionaries, with a
complementary program, Glosnost, designed in the United States, distributed by Eurolux
Computers (Luxembourg),

- Phenix: each term has a corresponding terminology record giving contextual data as
well as grammatical and lexical details (French, English, German, Spanish, Italian),
distributed by SITE (France),

- an electronic multilingual thesaurus distributed by Lexitech Utrecht (Netherlands).


Solution No. 6: Other peripheral tools for the translator:

- Système bilingue, which enables multilingual use of microcomputers (keyboard
reconfiguration, printing of national characters, ...), distributed by microcoque Inc.
(Canada), together with

- EGA-Font, for displaying national characters or scientific symbols not included in the
basic character set,

- Ted, a word processing environment specialized for translation (windows for the source
text, the target text and the translation, etc.), distributed by Ink Languages, and

- Textcount, automatic invoicing software for translators, with counting of words or
lines, distributed by Eurolux Computers (Luxembourg).


Solution No. 7: With the proliferation of data banks, telematic networks and the
gateways that make them easily accessible, it is apparent that, in the information
retrieval environment, there is room for solutions that make it easier to identify
useful information in a multilingual context. This may quite simply be multilingual
indexing (the Pascal file of the CNRS, or PERINORM from Afnor), but above all it means
integrating into information retrieval software language analyzer modules which rely on
multilingual knowledge bases and which, so to speak, allow automatic indexing of the
input text and its retrieval in any one of the languages accepted by the system. This is
the case of DARWIN, designed and distributed by the company cdrr (France). It is thus
possible, without knowing the language of the document corpus, to query that corpus in
another language and to obtain results that are more precise and relevant than those of
a Boolean search based on indexing with a multilingual thesaurus and proximity
operators.

Message routing software may also be included in this type of solution. It operates by
detecting the concepts that correspond to recipients, and it too uses an analyzer
comparable to the one found in CAT systems.
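
The simpler of the two approaches named under Solution No. 7, multilingual indexing with a thesaurus and a Boolean query, can be sketched as follows. It is not the analyzer-based method attributed to DARWIN, whose internals are not described here; the thesaurus entries, concept identifiers and function names are invented for this illustration.

```python
# Invented trilingual thesaurus: surface term -> language-neutral concept id.
THESAURUS = {
    "pump": "C1", "pompe": "C1", "pumpe": "C1",
    "corrosion": "C2", "korrosion": "C2",
}

def index_document(text):
    """Return the set of concept ids found in a text, whatever its language."""
    return {THESAURUS[w] for w in text.lower().split() if w in THESAURUS}

def search(query, corpus):
    """Boolean AND search: ids of documents containing every query concept."""
    wanted = index_document(query)
    return sorted(doc_id for doc_id, text in corpus.items()
                  if wanted and wanted <= index_document(text))

corpus = {"d1": "Korrosion an der Pumpe", "d2": "Corrosion of steel"}
print(search("pompe corrosion", corpus))  # French query, German and English corpus
print(search("corrosion", corpus))
```

Because matching happens on concept identifiers, the query language and the document language need not coincide. Exact word matching of this kind misses inflected forms and ambiguous terms, which is where a language analyzer would be expected to do better.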

Solution No. 8: This is the possibility open to a freelance translator or a company of
setting up in-house, within the limits of its field of activity, a translation aid
system on a microcomputer, that is, making use of an investment already made for other
purposes, for example word processing, querying terminology data banks, or other
applications such as those mentioned above. The systems in question are those such as
Alps, Weidner or Bravice, which the producer supplies with a general dictionary and
possibly specialized dictionaries, together with training in the use of the software.
Of course, if the input text is not already on magnetic media or cannot arrive by
online file transfer (downloading), the workstation will have to be supplemented with an
optical reader providing character recognition, of the Inovatic type. Care must be taken
to ensure that one will then benefit systematically from improvements made to the
software, for things move fast in this field and there is a risk of soon having an
obsolete installation. (See Annex 2 for the main character recognition software
available on the French market.)


It should also be understood that ongoing interaction with the system will have to be
maintained in order to complete the dictionaries as their gaps are discovered. The CAT
software can then be combined with software for processing, managing or navigating a
syntactic-lexical base if one wishes to refine the system and not settle for overly
crude processing.

An indispensable preliminary will also be to make enquiries with the system supplier,
and also with other users of the system who may themselves have developed similar tools
and who would be interested in cooperating to reduce their own costs.

Finally, it should also be understood that very short documents (a few pages) which have
not been digitized beforehand entail a series of manipulations that lower productivity
and in the long run make the use of CAT questionable. CAT is, by contrast, fully
justified if the workstation is set up with due attention to its ergonomics.


Solution No. 9: This is the solution that can be applied to the information generated by
the company, which ranges from advertising to the technical documentation accompanying
products and services. It is the whole of the outgoing information flows. This
information is distinctive in that it covers a delimited, well-mastered sector in which
the company is a specialist or expert and is therefore fully capable of defining and
controlling the necessary and sufficient lexical subset, and possibly even of
establishing a syntactic subset in conjunction with a style or drafting guide. The
company probably already has a complete publishing chain involving digitization and a
set of checks. It is also possible that the company has to protect part of its
documentary output and that confidentiality problems exist even though translation is
required, for example under international cooperation agreements. Given such needs, CAT
will have to be an in-house tool that can be integrated easily into a publishing process
and that can accept dictionaries compiled for internal needs. This capacity for
integration may then be an important selection criterion. Beyond small systems such as
Weidner, which are inadequate for large volumes, one may therefore consider installing a
more powerful translation system in-house. Such a project can be economically viable
only if it is studied jointly by a group of users. In this respect the example of CIGREF
(Club Informatique des Grandes Entreprises Françaises) is extremely interesting because
it carries enough weight with the system designer or distributor to obtain the desired
adaptations and to define jointly a doctrine for the development and use of the CAT tool
that serves the community.


Solution No. 10: Conversely to the preceding case, there is another type of need, which
concerns incoming information flows, in particular the querying of textual data banks,
which today are mostly bibliographic but increasingly offer access to the full text.

The user needs to be able to scan this textual content rapidly in order to identify,
from an online search or from an online selective dissemination of information based on
his activity "profile", information that will sometimes be in his own language and
sometimes in various foreign languages. In a first stage he is faced with abstracts,
generally in English, coming from one or more information hosts. Online access to an
assisted translation server can then be integrated at the level of a gateway in order to
present this user, who is himself an expert in the field concerned, with a raw
translation that will generally be sufficient for him, until he can have the primary
document translated with greater care. He will thus have been able to identify that
document more easily than if the data base consulted is in a language he knows only very
poorly. CEDOCAR has undertaken trials with Systran for this type of need and solution,
in which translation is done while trying to batch both the volumes and the
transactions. One can of course imagine the whole data bank being put through
translation by its producer; that is the subject of an economic study which remains to
be carried out.


Solution No. 11: It can happen that a company finds no translation
system matching its need, in this case the need to produce a
bibliographic database in several languages and to be able to query
it in those languages. The company can then create its own automatic
translation system, since this is no longer CAT. It is an act of
faith, but it is not unreasonable to think that such a system could
interest other industrial sectors where people also work in a
multilingual context on the production of a shared database, in which
case only the lexical base would need to be revised. The case of the
Institut Textile de France, author of TITUS, which has been
operational for several years despite the constraints imposed on
writers, deserves a pause.

TITUS is moving towards a version V that will shortly be in service,
in which these constraints will be very light and entirely
acceptable.

Choosing a CAT system.

The criteria to be taken into account, with a weighting that remains
to be determined, are the following:

- level of intelligibility of the raw output, in general and, for a
sector-specific activity, in the sector concerned.

- language pairs accepted by the system, directly or through an
intermediate language.

- processing speed.

- volumes to be translated (in the present situation on the one hand,
and assuming the use of CAT on the other), this point being of course
directly related to the preceding criterion.

- volume, quality, accessibility and ease of updating and correcting
the linguistic tools (dictionaries, thesauri, knowledge bases) and
possibility of navigating between these tools according to context
(artificial intelligence).

- possibility of obtaining, as a by-product of dictionary updating, a
parameterization product on magnetic media that could be reused as
other CAT systems evolve, even competing ones.

- ease of integration into the document processing chain.

- compatibility with the existing computer equipment and the data
transmission network.

- other users of the system, and the possibility of an association
with them.

- confidentiality aspects.

- number and qualification level of the staff associated with
operating the system, including revisers. Cost of the training
required for each qualification.

- ergonomics of the system and level of acceptability to translators
and to the end user (if there is no revision phase).

- possibilities for learning and improvement of the system
(organization of dictionary enrichment, level of complexity and cost
of that enrichment).

- possibilities of taking syntactic corrections into account.

- risks of interference and disturbance between several users of the
system.

- purchase or usage price of the system, and valuation of the user's
contributions, for example in building and developing dictionaries
that may be used by others (royalties or rebates).

- comparative evaluation of the cost with CAT and the cost of
translation without CAT per 100 words, and of the advantages and
drawbacks of each solution (volumes, turnaround times), projected
over a few years.
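
Since the weighting of these criteria is left open, one simple way to
compare candidate systems is a weighted mean of ratings. The sketch
below is only an illustration: the criterion names, the weights, the
0-10 rating scale and the two candidate systems are all hypothetical
and do not come from the text.

```python
# Hypothetical weights for a subset of the criteria listed above.
# The source states that the weighting "remains to be determined".
WEIGHTS = {
    "intelligibility": 0.30,
    "language_pairs": 0.20,
    "speed": 0.10,
    "integration": 0.15,
    "ergonomics": 0.10,
    "cost": 0.15,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted mean of 0-10 ratings, one rating per criterion."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


# Invented ratings for two imaginary systems.
candidates = {
    "system_a": {"intelligibility": 7, "language_pairs": 5, "speed": 8,
                 "integration": 6, "ergonomics": 6, "cost": 4},
    "system_b": {"intelligibility": 5, "language_pairs": 8, "speed": 6,
                 "integration": 7, "ergonomics": 7, "cost": 6},
}

# Best-scoring candidate first.
ranking = sorted(candidates,
                 key=lambda name: weighted_score(candidates[name]),
                 reverse=True)
```

A weighted mean is only one possible aggregation; criteria such as
confidentiality may in practice act as pass/fail conditions rather
than as graded scores.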


The attached table (Annex 4) presents the main characteristics of a
few operational systems. This overview of possible solutions is
probably not exhaustive. To go further, one can turn to the contact
point that is generally given. No value judgment is made here on
these solutions, all the more so since effectiveness always depends
on the nature of the application and on an environment that can
differ greatly from one application to another.

It may also be useful to give an indication of the rates paid to
translators. The figures in Annex 3 were supplied in 1982 by Loll
Rolling, of the CEC Luxembourg, and would therefore need to be
updated. Annex 3 also contains elements of cost comparison between
machine translation and human translation.


Conclusion:

There are today a number of ways to improve productivity in
translation, from simple recourse to electronic dictionaries or other
linguistic tools installed locally or accessible online, up to CAT
proper, by way of intermediate solutions such as multilingual
indexing or text analysers relying on multilingual knowledge bases.
The concern to pool completed translations (World Translation Index),
and even translations in progress, also works towards improved
productivity.

From now on, the major obstacle and extra cost that keying in the
text has represented must be avoided, which means that text must be
generated in digital form, something that today fortunately is
becoming general. It is of course when the text is already digitized
that substantial gains can be expected. This means turning to
electronic publishing and not holding on for too long to paper alone.

The aim should be to integrate CAT into the document processing chain
and to place it preferably in the information service, where the
environment is most favourable, which will allow savings on
investment.

Translators and interpreters must be more closely involved, not only
as users but also to contribute their competence in enriching
semantic and syntactic content.

Particular attention must be paid to the ergonomics of the installed
systems, to obtain sufficient comfort of use.

It is also necessary to prepare for the industrialization of
language, considering that CAT is only one application of a more
general effort that concerns other sectors of communication. That
being so, CAT will be able to benefit from all the progress made for
other purposes in the field of language analysis and processing.

Finally, Europe and the United States need to re-evaluate the stakes
and the potential market so as not to leave the field open in this
sector to Japan, which today has an entirely different appreciation
of this market, of the value of CAT and of the need to advance it.

The skills of translators and interpreters must also be called upon,
both in developing the systems and in using them.

The stakes have probably been underestimated so far in the United
States and in Europe. Everything indicates that in Japan, on the
contrary, much more is being invested in this sector, not only in the
hope of exporting CAT systems but above all because CAT is seen as
the only way to reduce the language barrier very substantially, both
for obtaining information and for making oneself known.

Until advanced systems such as EUROTRA come into being, probably in
five or six years at best, the demand for translation, which is more
and more pressing today, must be met. No single user can make all the
effort needed to enrich the existing systems alone. Users should
therefore group together to select the system or systems that deserve
to be enriched. To the extent that dictionaries will still have to be
entered, these users can jointly choose those dictionaries, again
favouring those that already exist on magnetic media, and seeking a
methodology that allows parameterization independent of the
translation system, so that the result of this investment can
possibly be used for other systems.

Annex 1:

Terminology data banks


Name          Content

Eurodicautom  > 370,000 terms and contextual expressions
              > 90,000 abbreviations
              monthly update (2,000 entries)
              free manual in English and French
              ECHO host, 15, Av. de la Faiencerie, L 1510 Luxembourg
              Tel +3522076L

Normaterm     100,000 French and English terms drawn from French and
              international standards and from regulatory texts.
              Access through French or English
              (definitions, synonyms, generic and specific terms,
              cross-references, indication of sources...)
              associated messaging service on 3617, code normaterm.
              AFNOR, Tour Europe, Cedex 7, 92080 Paris La Defense
              Tel (1)42915613

Termdok       On CD-ROM, accessible from a PC, multilingual
              (English, French, Swedish, German, Norwegian, Danish,
              Finnish and Russian)
              225,000 terms with definitions
              groups 7 terminology data banks
              (Normaterm, Termium, TNC, Tepa, ...)
              Walters Lexicon Co., Sodermalmstorg 8, 17800 Stockholm,
              Sweden.
              Tel +46 (08) 439510. US$ 920, or from AFNOR 6,500 FF
              excluding tax

Termnet       International Network for Terminology
              Production and distribution of publications, products
              and services in the terminology sector on an
              international scale
              Heinestrasse 38, A-1020 Vienna, Austria.

Termium       To verify and standardize terminology in the two
              languages of Canada
              To help translators in their work
              Universite de Montreal or Regie de la Langue Francaise
              du Quebec.

TDB           Terminology data bank
              Integrated into a system for assisting translators and
              for assisting terminology development
              Carnegie Mellon University

ANNEX 2


Main OCR software packages on the French market
(after 01 Informatique, No. 1084)


Software            Publisher/distributor      Price

Autoread
  monoscanner       ISTC                       6950 FF
  multiscanner                                 8950 FF

Accatex             Datacopy/Alphasystem       9950 FF

Cognicar
  module 1          Cognisoft/Micropros        20000 FF
  module 2                                     13900 FF

Discover 9320
  module 10         Kurzweil/Penta System      8000 FF
  module 30                                    66000 FF

Image-Read
  Image-in          CPI/MTE                    4900 FF

K 5100              Kurzweil/Penta System      140000 FF
  Freedom                                      38000 FF

OCR +               Datacopy/Donatec           8950 FF

Omnipage 2.0        Caere/Softmart             9150 FF

Readstar Express    Inovatic                   9950 FF
Readstar 0                                     4990 FF
Readstar 2 +                                   20000 FF
Readstar 3 +                                   40000 FF
Readstar 6                                     75000 FF

Readright 2.0       OCR System/Canon           4400 FF

Recognita           S2KI/Apsylog               11900 FF
Recognita +                                    10900 FF

Scaned              Calera/Mentor Graphics     50000 FF

Texiris 2           Iris/LCE                   49950 FF
Texiris 2 +                                    40000 FF

Text Pert 3.0       CTA/P Ingenierie           9900 FF

Annex 3: Remuneration of translators

(after Loll Rolling, CEC, Luxembourg, 1982)

                       FF per 100 words
COUNTRY                European lang.     Exotic lang.

United States
  A.T.A.               8.5
  Free-lance           15 - 25

Great Britain          15 - 30

Belgium                20 - 20            36 - 00

Canada                 21 - 64

Switzerland            33

France                 35 - 50            65 - 100

Sweden                 70

Germany                65 - 100

Elements of comparison (1987)


raw translation ......

human ... 160 ...

Annex 4: PRINCIPAL SOLUTIONS


Name              Users                       Contact


Autonomous systems

Systran 13%       USAF, Xerox, GM,            Mr Loll Rolling
                  U7C Canada, NATO,           Mr I. Pigott
                  CEC, Dornier, IONA,         or Gachot S.A., 26 bis
                  KFK, Aerospatiale,          Av. de Paris, B.P. 14,
                  CEDOCAR                     95230 Soisy-sous-
                                              Montmorency, France

  130000 lines of program per language, 100000 rules, 500000
  words/hour (theory), 80000 (practice), IBM 4331, 5 gigabytes, 20
  centimes/word (50 FF/page)

Logos 26%         CEC Luxembourg,             Logos Corp.
                  Nixdorf, Opel, Siemens,     1, Dedham Place,
                  Mercedes, P tall.,          Dedham, Ma.
                  Burroughs                   02026 USA

  English-Vietnamese, French, German, Spanish
  IBM, Wang, Unisys... 30 to 40 centimes/word, including amortization
  over five years.

Metal             Siemens
  German-English, English-German.

PKI               Philips Communications
                  Industries

B'VITAL (Ariane)  SITE (France)               Mr Pelletier, CIGREF,
                                              21, Rue de Messine,
                                              75008 Paris

  1.5 million operations per word
  being industrialized, IBM 43XX, 30XX, 93XX


Controlled-syntax systems

TAUM

TITUS             ITF,                        Mr J. M. Ducrot,
                  Agriculture tropicale,      Institut Textile de
                  Allied Chemicals            France, 23, Av. A.
                                              Briand, BP 141, 92223
                                              Bagneux CEDEX, France

  French, English, German, Spanish. Writing time increased by 10% for
  writing in the TITUS language. IBM. Origin: Prof. Baker, USA.


Interactive systems

Weidner 23%       Navy, Aerospatiale          TAO International, 37
                                              ter, Rue de Metz,
                                              31000 Toulouse, France

Transactive (ALPS) 12%   NATO

Ericsson 16%


Major projects:

EUROTRA           13 European universities, 45 MECU

CMT (USA)         CMU (speech recognition)

ATR               Telephone interpreting, English-Japanese, 4 billion FF

EDR               Electronic dictionary


1,5 


Oleg Lavroff
Societe Aerospatiale
Head of the Joint Documentary Information Centre
B.P. No. 76, 92152 Suresnes Cedex
France


Summary

The paper presents raw texts translated by machine (Machine
Translation, MT) and post-edited texts (in refined version), and gives
the time spent by a translator using the SYSTRAN system. A summary
table gives the times and costs of translation done with MT and
brings out the productivity gains obtained relative to a fully human
translation. The results presented concern only translations for
which the companies' legal liability is not engaged.


1. General considerations

The comparison between translations made by a human translator and
those obtained by MT (Machine Translation) always raises passionate
controversy between "classical" translators (who refuse the machine
as a translator) and "new generation" translators (who draw maximum
benefit from the machine).

If a group made up solely of translators is asked to choose the best
of several translations of a text made by competent translators, one
of which was translated by machine and then given a refined post-edit
by a convinced supporter of MT, experience shows that the group is
unable to pick out that text, or even to say which one was processed
by the machine.

At the present stage of development of MT systems, it can be stated
that the quality of a machine translation followed by refined
post-editing is at the same level as that of a fully human
translation, provided that terminology problems are settled, the
translated texts are systematically analysed, a permanent dialogue is
maintained with the system designers and translators convinced by MT
are used. A clear distinction must nevertheless be made between the
possible fields of application and the possible needs for integrating
the systems into companies.


When comparing texts translated by a human translator or with the aid
of the machine, one can only compare the results obtained after
refined post-editing, which by definition must be equivalent to human
translation. Consequently, the examples presented in this paper
concern only the source text, the machine translation and the refined
post-edit (human translation necessarily differing from one
translator to another).

To compare the two modes of translation objectively, it is
imperative, in the case of MT, to use motivated and objective
translators who aim for a translation of human quality. They must
also be given all the tools suited to their needs (user-friendly word
processor, terminology search integrated into the workstation, modem
for automatic connection to the software's server, etc.).

For texts requiring a very high level of quality, MT calls for other
technical measures to be applied before the translations are
launched. For example, a spelling checker must be used (source
language), ambiguities clarified, overly long and complex sentences
rewritten where necessary, and terminology unknown to the system
extracted. In this way, working procedures to be followed when the
texts are drafted are defined upstream of translation. To date, a
good number of specialists are working in this field, bearing in mind
that writers increasingly try to write directly in the target
language. Consequently, the comparison of costs and turnaround times
will cover only so-called "current information" texts that have to be
translated and delivered quickly.

This designation covers, on the one hand, the notion of "information
awareness", for which it can be estimated that in 50% of cases an MT
translation with minimal post-editing is amply sufficient, and, on
the other hand, texts circulated outside companies but not generally
engaging their legal liability.

Given certain technical aspects of producing after-sales technical
documentation, it seems difficult at present to use MT in this field,
unless software is available that can be integrated easily and
economically into the companies' operational sites.


2. Comparison texts

Annex 1 contains three comparison texts that illustrate the table of
costs and turnaround times for translation obtained by MT and for
translation done by a human translator.


The first text is an extract from a report of the AIAA Technical
Committee on Technical Information (translation from English into
French).

The second text is an extract from a technical note on
non-destructive testing techniques (translation from French into
English).

The third text is a provisional working note on the preparation of
our lecture series (translation from French into English).

In all three cases the post-edit presented is a refined post-edit.
The texts were translated using the SYSTRAN system from a workstation
(IBM PC type microcomputer) installed in a company.

Clearly, the time for refined post-editing varies from one text to
another and even within a text, depending on the fields covered, on
the absence of already coded terminology and on how the source texts
are written. Consequently, at present, the total translation time
shown for these examples reflects minimal processing in the best
case. There can be cases where the refined post-editing time for a
paragraph exceeds the time a human translator would take to translate
it (on general average 250 to 300 words per hour, depending on the
difficulties encountered).


In order to reason outside any monetary context, the economic
information is given on the basis of the following assumptions and
references for a page of 250 words:

- Human translation

  . Time: 1 hour (typing included)

  . Cost: base reference 100

- Character recognition

  . Time: 3.5 minutes

  . OCR cost: 2.6% of the base reference

- Transmission, processing and reception

  . Time: 1.5 minutes

  . Cost: 37.5% of the base reference

- Post-editing

  . minimal

    Time: 10 minutes

    Cost: 12.5% of the base reference

  . refined

    Time: 33 minutes

    Cost: 41.2% of the base reference




- Summary

Processing              Time          Cost

Minimal post-editing    15 minutes    52.6% of the base reference

Refined post-editing    38 minutes    81.3% of the base reference
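
The two totals follow from adding the per-page figures for OCR,
transmission and post-editing given above. A minimal sketch of that
arithmetic, in which the names Step, OCR, TRANSFER, POST_EDIT and
page_totals are illustrative and only the numbers come from the text:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    minutes: float   # time per 250-word page
    cost_pct: float  # cost as a percentage of the human-translation reference (100)


# Figures quoted in the text for one page of 250 words.
OCR = Step(minutes=3.5, cost_pct=2.6)
TRANSFER = Step(minutes=1.5, cost_pct=37.5)
POST_EDIT = {
    "minimal": Step(minutes=10, cost_pct=12.5),
    "refined": Step(minutes=33, cost_pct=41.2),
}


def page_totals(level):
    """Return (minutes, cost %) per page for the given post-editing level."""
    steps = (OCR, TRANSFER, POST_EDIT[level])
    minutes = sum(s.minutes for s in steps)
    cost = round(sum(s.cost_pct for s in steps), 1)
    return minutes, cost


minimal = page_totals("minimal")   # 15 minutes, 52.6 %
refined = page_totals("refined")   # 38 minutes, 81.3 %
```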

These values come from a translation office using the SYSTRAN system
from a workstation (IBM PC type microcomputer) that can be connected
to an external server. The statistics are based on about 1000 pages
covering technical, economic and industrial policy fields.

To date, this office translates more than 50% of its in-house
translations with the aid of MT and takes a very active part in
improving the system by sending the designer a systematic analysis of
the translated texts.

These first, very encouraging operational results allow us to draw up
the following summary table and a forecast diagram of the evolution
of machine translation costs. The values in the table are given for a
processing batch of 10 pages (the current transfer limit that allows
raw machine translations to be received online).


Step                                Time                       Cost (1)

OCR                                 35 mn                      2.6 %

Transfer and machine translation    15 mn                      37.5 %

Minimal post-editing                1h 40'                     12.5 %

Refined post-editing                5h 30'                     41.2 %

Total, minimal post-editing         2h 30' (15 mn per page)    52.6 %

Total, refined post-editing         6h 20' (38 mn per page)    81.3 %

(1) relative to a human translation base reference:
. Time: 250 words per hour
. Cost: reference 100
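
The batch times are the per-page times multiplied by the 10 pages of
a transfer batch. A short sketch of that scaling, where the function
and constant names are illustrative and the per-page minutes are those
quoted earlier in the text:

```python
PAGES_PER_BATCH = 10  # current transfer limit stated in the text

# Minutes per 250-word page, as quoted in the text.
MINUTES_PER_PAGE = {"ocr": 3.5, "transfer": 1.5, "minimal": 10, "refined": 33}


def batch_minutes(level, pages=PAGES_PER_BATCH):
    """Total minutes for OCR, transfer and post-editing of a whole batch."""
    per_page = (MINUTES_PER_PAGE["ocr"]
                + MINUTES_PER_PAGE["transfer"]
                + MINUTES_PER_PAGE[level])
    return per_page * pages


def as_hours(minutes):
    """Format a duration in minutes in the table's style, e.g. 2h 30'."""
    hours, rest = divmod(round(minutes), 60)
    return f"{hours}h {rest:02d}'"


minimal_total = as_hours(batch_minutes("minimal"))
refined_total = as_hours(batch_minutes("refined"))
```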


THE ECONOMIC FACTOR


A reduction in the costs and times of translation


1. Translation costs and time for informative texts (minimum
post-editing)

   1970   T and C: 100%   Translators + dictionaries

   1989   T, C: 50%       Machine translation systems under
                          development + translators

   1991   T, C: 30%       Machine translation systems + post-editors


2. Translation costs and time for documents to be dispatched abroad
(refined post-editing)

   1970   T and C: 100%   Translators + dictionaries

   1989   T, C: 80%       Machine translation systems under
                          development + translators

   1995   T, C: 50%, 45%  Machine translation systems + post-editors


The financial investment for a complete workstation (see the diagrams
below), expressed in French francs (FF), is as follows:

. A scanner and its character recognition software:

~ 80 000 FF

. An IBM PC type microcomputer with a user-friendly word processor,
an EGA card and modem cards:

~ 50 000 FF

. A laser printer:

~ 20 000 FF


. Total investment for one workstation:

~ 150 000 FF

The investment for several translators is lower because the scanner
and the printer can be shared by the users. Thus, for an example of 5
translators, the investment per workstation will be of the order of
70 000 FF. Under these conditions, the investment can be paid back in
one year, or over about two years for a single workstation.
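
The figure of 70 000 FF per workstation is consistent with each
translator having a microcomputer while one scanner and one laser
printer are shared by the whole group. A minimal sketch under that
assumption (the sharing rule and the names are illustrative; the
prices are those quoted above):

```python
# Approximate prices in French francs, as quoted in the text.
SCANNER_WITH_OCR = 80_000
PC_WITH_SOFTWARE = 50_000
LASER_PRINTER = 20_000


def investment_per_station(translators):
    """Investment per workstation when one scanner and one printer are shared.

    Assumes every translator has a PC and the whole group shares a single
    scanner and a single laser printer.
    """
    if translators < 1:
        raise ValueError("at least one translator is required")
    shared = SCANNER_WITH_OCR + LASER_PRINTER
    return PC_WITH_SOFTWARE + shared / translators


single = investment_per_station(1)   # 150 000 FF
group = investment_per_station(5)    # 70 000 FF
```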


4. Conclusions


It can be estimated at present that the introduction of Machine
Translation into translation offices, provided that all the human and
material requirements are met and that this new translation technique
is properly promoted, should make it possible in the short term to
improve the productivity of these offices appreciably.

The advent of MT leads us to redefine the translator's tasks and to
transfer to secretarial staff work that does not require a
translator's competence (for example, character recognition). MT
systems today make it possible, for well-defined fields and
applications, to process about two pages an hour in refined
post-editing. Some specialists or system designers consider that 3
pages an hour can be processed in this way. For our part, we consider
that, as things stand, 2 pages of post-editing per hour are entirely
achievable, which leads us to conclude that, under these conditions,
the potential gain from MT is of the order of:

- 37% on processing time


- 20% on costs.
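
These orders of magnitude can be checked against the per-page figures
given earlier for refined post-editing (38 minutes and 81.3% of the
reference cost, against 60 minutes and 100 for human translation).
The sketch below assumes that this is how the two gains were derived,
which the text does not state explicitly:

```python
# Per-page reference values quoted earlier in the paper.
HUMAN_MINUTES = 60          # 250 words per hour, one 250-word page
HUMAN_COST = 100            # base reference
MT_REFINED_MINUTES = 38     # OCR + transfer + refined post-editing
MT_REFINED_COST = 81.3      # percentage of the base reference


def gain_pct(baseline, value):
    """Relative saving, in percent, of `value` compared with `baseline`."""
    return 100 * (baseline - value) / baseline


time_gain = gain_pct(HUMAN_MINUTES, MT_REFINED_MINUTES)  # about 36.7 %
cost_gain = gain_pct(HUMAN_COST, MT_REFINED_COST)        # about 18.7 %
```

Rounded, these give roughly 37% on time and a little under 20% on
cost, in line with the orders of magnitude quoted.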


OPERATING CHART


Direct connection


CENTRAL COMPUTER
(storage, calling of texts,
connections with other sites)

Comparison between human and
automatic translations
(quality, costs and processing time)


MR. O. LAVROFF


Abstract


This paper presents raw machine translations and post-edited texts
(the result of refined post-editing), together with the time a
translator spends on each when using the SYSTRAN system. The times
and costs of machine translation are summarized in a chart, which
highlights the productivity gain over an all-human translation. The
results reported here apply only to translations that do not engage
the company's legal liability.


1.  General overview.

The comparison between texts translated by a human translator and
machine translations always raises impassioned controversy between
"classical" translators (who reject the machine as a translator) and
"new generation" translators (who make the most of the machine).

If a group consisting only of translators is asked to choose the best
translation of a text from among several translations by qualified
translators and one machine translation thoroughly post-edited by an
advocate of machine translation, experience shows that the group is
unable to pick out that text, or even to say which text was produced
by a machine translation system.

At the current stage of development of machine translation systems,
it can be asserted that the quality of a thoroughly post-edited
machine translation is similar to that of an entirely human
translation, provided that all terminology problems have been
settled, that the machine translations are systematically analyzed,
that permanent contact is kept with the system designers and that the
translators involved are convinced of the benefits of machine
translation. A clear distinction must however be made between the
possible fields of application and the possible requirements for
integrating such systems in companies.
When comparing texts translated by a human translator with texts
translated by machine, the only relevant terms of comparison are the
results of a machine translation followed by refined post-editing,
whose quality must be equivalent to that of a human translation.
Consequently, the examples given in this paper concern only the
source text, the machine-translated text and the refined post-edit
(bearing in mind that a human translation differs from one translator
to another).

In order to compare the two types of translation objectively,
motivated and objective translators have to be called upon for the
machine translation, and they have to strive for a translation of
human quality. In addition, translators have to be provided with
tools adapted to their needs (user-friendly word processor,
terminology search integrated into the workstation, modem for
automatic connection to the host system, etc.).

For texts requiring a very high level of quality, other technical
measures have to be taken before the text is sent for machine
translation. For example, a spelling checker has to be used,
ambiguities clarified, long or complex sentences rewritten, and
terminology unknown to the system identified. It is then possible to
define, upstream of translation, working procedures to be followed
when texts are written. To date, a good number of specialists are
working in this field, taking into account that more and more writers
try to write directly in the target language. Consequently, the
comparison of cost and time will concern only the so-called "current
information" texts, which have to be translated and delivered
quickly.

This  heading  means,  on  the  one  hand,  the  "information  knowledge" 
idea,  where  it  can  be  estimated  that  for  around  50%  of  texts,  a 
machine  translation  with  a  minimum  post-editing  is  largely 
sufficient,  and  on  the  other  hand,  texts  disseminated  outside 
companies  but  for  which  their  liability  is  not  involved. 

As regards after-sales technical documents, it still seems
difficult to use machine translation in this field, unless the
software can be integrated, easily and without undue expense, into
the companies' operational sites.


2.  Comparison  of  texts. 

Appendix 1 shows three texts for comparison, which illustrate the
cost and time chart for machine translation and for human
translation.

The first text is an extract from a Technical Information
Technical Committee Report of the AIAA (translation from English
into French).

The second text is an extract from a technical memorandum on
non-destructive testing techniques (translation from French into
English).

The  third  text  is  a  background  paper  on  the  preparation  of  our 
Lecture  Series  (translation  from  French  into  English). 

In all cases, the post-edited version shown is a refined
post-editing. The texts were translated by the SYSTRAN system from
a work station (such as an IBM-PC) used within a company.

Clearly, the time required for refined post-editing varies from one
text to another and within the text itself, depending on the
subjects treated, the terminology already coded and the quality of
writing of the source text. Consequently, the total translation
time indicated currently represents the minimum processing for the
best possible result. It may sometimes happen that the post-editing
time spent on a paragraph exceeds that of a human translation
(which averages 250 to 300 words per hour, depending on the
difficulties encountered).


3.  Economic  aspects 

In order to leave aside any currency considerations, the economic
information is given on the basis of the following hypotheses and
references for a 250-word page:

- Human translation

  . Time : 1 hour (typing included)

  . Cost : basic index 100

- Machine translation

  . Optical character reading

    - Time : 3.5 min
    - Cost : 2.6 % of basic index

  . Transmission, processing and reception

    - Time : 1.5 min
    - Cost : 37.5 % of basic index

  . Post-editing

    . minimum

      - Time : 10 min
      - Cost : 12.5 % of basic index

    . refined

      - Time : 33 min
      - Cost : 41.2 % of basic index

  . Summary

    Processing              Time      Cost

    Minimum post-editing    15 min    52.6 % of basic index
    Refined post-editing    38 min    81.3 % of basic index

These values were supplied by a translation bureau using the
SYSTRAN system at a work station (such as an IBM-PC) connectable to
an external host system. The statistics were compiled from about
1000 pages on technical, economic and industrial subjects.
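
As a quick consistency check, the per-page figures quoted above can be
recombined in a few lines of Python. This is only an illustrative sketch:
the dictionary and function names are assumptions made for the example,
and only the numeric values come from the text.

```python
# Per-page figures for a 250-word page, as quoted in the text.
# Each entry is (time in minutes, cost as % of the human-translation index 100).
FIXED_STEPS = {
    "optical_character_reading": (3.5, 2.6),
    "transmission_processing_reception": (1.5, 37.5),
}
POST_EDITING = {
    "minimum": (10.0, 12.5),
    "refined": (33.0, 41.2),
}


def totals(level):
    """Return (minutes, cost index) for one page at the given post-editing level."""
    items = list(FIXED_STEPS.values()) + [POST_EDITING[level]]
    minutes = sum(t for t, _ in items)
    cost = round(sum(c for _, c in items), 1)
    return minutes, cost


if __name__ == "__main__":
    for level in POST_EDITING:
        minutes, cost = totals(level)
        print(f"{level} post-editing: {minutes:g} min, {cost} % of basic index")
```

Summing the steps gives 15 min and 52.6 % for minimum post-editing, and
38 min and 81.3 % for refined post-editing, which agrees with the summary.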

At present, this bureau translates over 50% of its in-house texts
by machine and contributes very actively to the improvement of the
system by sending the designer a systematic analysis of translated
texts.

These first operational results are very promising and enable us to
draw up the following summary chart and the prospective diagram of
machine translation development. The chart values are given for a
batch of 10 pages (the current transfer limit that does not hinder
on-line reception of the rough machine translation).


Summary chart for a batch of 10 pages (1)

                  Optical     Transfer and   Post-editing           Total
                  character   machine        minimum    refined     minimum     refined
                  reading     translation

Time (10 pages)   35'         15'            1 hr 40'   5 hrs 30'   2 hrs 30'   6 hrs 20'
Time (per page)                                                     15'         38'
Cost              2.6%        37.5%          12.5%      41.2%       52.6%       81.3%
(1) with reference to an average basic human translation:
. Time : 250 words per hour
. Cost : index 100

[Figure: THE ECONOMIC FACTOR. A reduction in the costs and time of
translation. Panel 1: translation costs and time for information
texts (minimum post-editing). Panel 2: translation costs and time
for documents to be dispatched abroad (refined post-editing). Each
panel plots time (T) and cost (C) as a percentage of the 1970 level
(100%, translators + dictionaries), through an intermediate stage
(machine translation systems under development + translators, dated
1989 in panel 2), to machine translation systems + post-editors
(1991 in panel 1, 1995 in panel 2). Legible values: panel 1, 50% at
the intermediate stage, then 30% and 20% in 1991; panel 2, 63% in
1989, then 50% and 45% in 1995.]

The financial investment for a full work station (see following
diagrams), expressed in French francs, is as follows:

. a scanner and optical reading software : 80,000 FF

. a microcomputer, such as an IBM-PC, equipped with user-friendly
word processing software, an EGA card and modem cards for
connection : 50,000 FF

. a laser printer : 20,000 FF

. Total investment for one work station : 150,000 FF


The investment for several translators is reduced, as the scanner
and the printer can be shared between users. For example, for 5
translators, the investment per work station amounts to about
70,000 FF. Under these conditions, the investment could be
amortized within one year, or within about two years for a single
work station.
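
The effect of sharing the scanner and printer can be worked through
with a short Python sketch. The grouping of items into "shared" and
"per translator" follows the paragraph above; the variable and function
names are assumptions made for the illustration.

```python
# Investment figures in French francs, as listed in the text.
SHARED_EQUIPMENT = {
    "scanner_and_optical_reading_software": 80_000,
    "laser_printer": 20_000,
}
PER_TRANSLATOR_EQUIPMENT = {
    "microcomputer_with_word_processing_ega_and_modem": 50_000,
}


def investment_per_station(translators):
    """Cost per work station when shared equipment is split between translators."""
    if translators < 1:
        raise ValueError("at least one translator is required")
    shared_part = sum(SHARED_EQUIPMENT.values()) / translators
    return sum(PER_TRANSLATOR_EQUIPMENT.values()) + shared_part


if __name__ == "__main__":
    for n in (1, 5):
        print(f"{n} translator(s): {investment_per_station(n):,.0f} FF per work station")
```

With one translator the function returns the full 150,000 FF; with five
it returns 70,000 FF, the figure quoted in the text.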


4.  Conclusion 

Provided that all human and material requirements are met and that
this new translation technique is correctly promoted, the
introduction of machine translation in translation bureaux should
significantly improve their productivity in the short run.

The emergence of machine translation leads to a redefinition of
the translator's tasks and to a transfer to secretaries of tasks
which do not require translator skills (for example, optical
character reading). At present, for well-defined fields and
applications, machine translation systems make it possible to
process around 2 refined post-edited pages per hour. Some
specialists or system designers believe that 3 pages may be
processed per hour. We think, however, that for the time being 2
post-edited pages per hour are perfectly achievable, which leads us
to conclude that, in these circumstances, the potential gain of
machine translation is about:

- 37 % of processing time

- 20 % of cost.
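
These two percentages appear to follow from the per-page summary in
section 3 (38 min and a cost index of 81.3 for refined post-editing,
against 60 min and index 100 for human translation). The sketch below
reproduces that arithmetic; the function name is an assumption made for
the example, and the link to the section 3 figures is an interpretation
rather than something the text states explicitly.

```python
def relative_gain(human, machine):
    """Percentage saved by the machine-assisted process relative to the human one."""
    return round(100 * (human - machine) / human, 1)


# Reference values for one 250-word page, taken from section 3.
HUMAN_MINUTES, HUMAN_COST_INDEX = 60, 100
REFINED_MINUTES, REFINED_COST_INDEX = 38, 81.3

time_gain = relative_gain(HUMAN_MINUTES, REFINED_MINUTES)
cost_gain = relative_gain(HUMAN_COST_INDEX, REFINED_COST_INDEX)

if __name__ == "__main__":
    print(f"time gain: {time_gain} %")
    print(f"cost gain: {cost_gain} %")
```

The computed gains are 36.7 % for time and 18.7 % for cost, which the
text rounds to about 37 % and about 20 %.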


Appendix 1

TEXT N°1

"AIAA Technical Committee on Technical Information
Minutes of June 13, 1989, Meeting"


I LANGUAGE PAIR :

English into French


II GLOSSARIES SELECTED :

- Aviation and Space

- Legal

- Political Science


III SUCCESSIVE OPERATIONS OF THE MACHINE TRANSLATION PROCESS :

- Optical reader                   --> Time :  5 mn
- Sending a text for translation   --> Time :  4 mn
- Post editing / Minimum           --> Time :  5 mn
               / Refined           --> Time : 15 mn
- Number of words translated       --> 361

IV ANALYSIS OF THE ROUGH MACHINE TRANSLATION :


a/ Terminology

Source text                   Rough machine translation         Human translation

to issue minutes              établir des minutes               publier le compte rendu
current (members)             (membres) courants                (membres) actuels
information flow              écoulement de l'information       circulation de l'information
a topic                       une matière                       un sujet
information related           associations reliées par          associations liées au
associations                  information                       secteur de l'information
companies                     compagnies                        sociétés
programs                      régimes                           programmes
(articles) highlighting sth   accentuant qch                    relatifs à qch
to be due for (publication)   être dû pour la publication       seront publiés
charter                       charte                            statuts
a discussion was held         une discussion a été maintenue    une discussion a eu lieu
survey                        aperçu                            enquête
there was agreement to        il y avait convention pour        il a été convenu de
display (areas)               visualisation                     exposition

b/ Grammar

Source text                   Rough machine translation         Human translation

sb will be contacted on       qqun sera entré en contact pour   qqun sera chargé de
it was agreed that            il a convenu que                  il a été convenu que
(that the TC) appoint         (que le TC) nomment               (que le TC) nommera
by encouraging critical       par des évaluations critiques     en encourageant des
evaluations                   d'encourager                      évaluations critiques

c/ Defective analysis / prepositions

Source text                   Rough machine translation         Human translation

Meeting                       Réunissant                        Réunion
Since X is the (beginning)    Depuis X a lieu le                Etant donné que X est
                              (commencement)
(on the use and) mailing      (sur l'utilisation et) expédier   d'utiliser et d'expédier
(liaison) to sb               (liaison) à qqun                  (liaison) avec qqun
on writing sth                sur écrire qch                    au sujet de l'élaboration
                                                                de qch
input (from)                  entré                             les résultats
if appropriate                si approprié                      s'il y a lieu

e/ Word order

Source text                   Rough machine translation         Human translation

Technical Committee           Technique Comité                  Comité Technique
Electronic Transfer of        Transfert de l'information        Transfert Electronique
Information                   électronique                      de l'Information
(are due) September 7         (sont) le 7 septembre dû          seront publiés le 7
                                                                septembre
scope statement               le rapport de portée              domaine d'application
additional members and ...    membres et fournisseurs           membres supplémentaires
vendors                       additionnels                      et fournisseurs

f/ Unrecognized words

Source text                   Rough machine translation         Human translation

AIAA                          %AIAA                             l'AIAA
TCs                           %TCs                              Comités Techniques
TCTI                          %TCTI                             CTIT


SOURCE TEXT N°1


AIAA  Technical  Committee  on  Technical  Information 
Minutes  of  June  13,  1989,  Meeting 
New  York 


Since  May/June  is  the  beginning  of  the  TC  cycle,  B.  Lawrence  said 
that  this  second  meeting  of  the  TC  was  appropriate  to  issue 
formal  minutes  with  copies  to  selected  AIAA  headquarters  staff 
(1).  A  list  of  current  TC  members  will  be  submitted  to 
headquarters (2).

R.  Lewis  suggested  that  workshops  be  sponsored  or  special  studies 
be  undertaken  on  the  processes  of  information  flow  and  specific 
technologies  such  as  CD-ROM  or  electronic  publishing.  H.  Mindlin 
said  that  ASME  has  a  database  committee  which  promotes  this  type 
of  activity  and  represents  different  disciplines;  a  similar  group 
may work within the AIAA structure. The topic selected is
"Electronic Transfer of Information and Its Impact on Aerospace
Research and Development". Membership lists from information
related associations will be used to target managers in local
aerospace companies and organizations (8). AIAA
headquarters  will  be  contacted  on  the  use  of  a  conference  room 
and  mailing  the  brochures  (9). 

TC  members  discussed  the  importance  of  collaborating  with  other 
TCs  on  programs  and  activities.  It  was  agreed  that  the  TC  appoint 
a  person  as  liaison  to  the  Publications  Committee. 

Article  for  Aerospace  America.  Articles  from  each  TC  highlighting 
the  year's  activities  by  discipline  are  due  September  7  for 
publication  in  Aerospace  America.  The  content  of  our  TC's 
submission  will  be  on  developments  in  aerospace  technical 
information (14).

TC  Charter.  A  brief  discussion  was  held  on  writing  a  charter  for 
the  TC.  Since  input  from  the  survey  will  help  determine  the  role 
of  the  TC,  there  was  agreement  to  expand  the  scope  statement 
provided  by  B.  Lawrence  in  her  letter  of  March  23,  1988. 

The  Technical  Committee  on  Technical  Information  (TCTI)  promotes 
the  development  of  aerospace  scientific  and  technical  information 
services.  The  TCTI  encourages  the  flow  of  technical  information 
throughout  the  aerospace  community  by  organizing  activities  which 
provide  a  forum  for  the  exchange  of  ideas  and  by  encouraging 
critical  evaluations  of  information  transfer  processes. 

Closing  Remarks.  Items  on  the  agenda  not  discussed  were 
recommendations  of  additional  committee  members  and  information 
vendors in display areas. If appropriate, both will be discussed
at  the  next  TC  meeting. 




ROUGH MACHINE TRANSLATION N°1


MOTS TRAITES : 361 COMPTE AVANT : 286163- COMPTE APRES : 286524-

0100100000PDEM EF YPA TG=4LPDEBUG=S SYS=UCDATE=19 02 90

11H15 0002544


Technique comité %AIAA d'information technique
Compte rendu du 13 juin, 1989, réunissant
New York


Depuis mai/juin a lieu le commencement du cycle de TC, B. Laurent
dit que cette deuxième réunion du TC était appropriée pour
établir des minutes formelles avec des copies au personnel
sélectionné des sièges sociaux %AIAA (1). Une liste des membres
courants de TC sera soumise aux sièges sociaux (2).

R. Lewis a proposé que des ateliers soient pris en charge ou des
études spéciales soient entreprises sur les procédés de
l'écoulement de l'information et des technologies spécifiques
telles que la CD-SCROM ou l'édition électronique. H. Mindlin a
dit qu'ASME a un comité de base de données qui favorise ce type
d'activité et représente des disciplines différentes; un groupe
semblable peut travailler dans la structure %AIAA. La matière
sélectionnée est " transfert de l'information électronique et son
impact sur la recherche et le développement aérospatiaux ". Des
listes des membres des associations reliées par information
seront employées pour viser des directeurs dans les compagnies et
les organismes aérospatiaux locaux (8). Les sièges sociaux %AIAA
seront entrée en contact sur l'utilisation d'une salle de
conférence et expédier les brochures (9).

Les membres de TC ont discuté l'importance de la collaboration
avec l'autre %TCs sur des régimes et des activités. Il a convenu
que le TC nomment une personne comme liaison au Comité de
publications.

Article pour l'Amérique aérospatiale. Les articles de chaque TC
accentuant les activités de l'année par discipline sont le 7
septembre dû pour la publication en Amérique aérospatiale. La
teneur de la présentation de notre %TC sera sur des
développements dans l'information technique aérospatiale (%14).

Charte de TC. Une brève discussion a été maintenue sur écrire une
charte pour le TC. Puisqu'entré de l'aperçu aidera à déterminer
le rôle du TC, il y avait convention pour augmenter le rapport de
portée fourni par B. Laurent dans sa lettre du 23 mars, 1988.

Le comité technique de l'information technique (%TCTI) favorise
le développement des services d'information scientifique et
technique aérospatiale. Le %TCTI encourage l'écoulement
d'information technique dans toute la communauté aérospatiale par
l'organisation des activités qui fournissent un forum pour
l'échange des idées et par des évaluations critiques d'encourager
des procédés de transfert de l'information.

Observations finales. Les articles aux ordres du jour non
discutés étaient des recommandations des membres de comité et des
fournisseurs additionnels de l'information dans des zones de
visualisation. Si approprié, tous les deux seront discutés lors
de la prochaine réunion de TC.


POST-EDITING N°1


Comité Technique de l'AIAA sur l'Information Technique
Compte Rendu de la Réunion du 13 juin 1989
New-York


Etant donné que les mois de Mai et de Juin correspondent au début
du cycle de réunions du Comité Technique, Mme B. Lawrence a
estimé que le moment était opportun de publier le compte rendu
officiel et d'en envoyer un exemplaire à certains membres des
sièges sociaux de l'AIAA (1). Une liste des membres actuels du
Comité Technique sera soumise au siège social (2).

M. R. Lewis a proposé d'organiser des ateliers ou de procéder à
des études spéciales sur les moyens de circulation de
l'information et les techniques spécifiques telles que le CD-ROM
ou l'édition électronique. M. H. Mindlin a déclaré qu'au sein
de l'ASME, une commission spécialisée dans les bases de données
encourage ce genre d'activité et agit dans diverses disciplines.
Un groupe semblable pourrait travailler au sein de l'AIAA.

Le sujet retenu est "Le Transfert Electronique de l'Information
et ses Répercussions sur la Recherche et le Développement dans le
Domaine Aéronautique et Spatial".

Des listes des membres des associations liées au secteur de
l'information seront utilisées pour localiser les directeurs des
sociétés et organismes régionaux dans le secteur aéronautique et
spatial (8). Le siège social de l'AIAA sera chargé de trouver
une salle de conférence et d'expédier les brochures (9).

Les membres du Comité Technique ont discuté de l'importance de la
collaboration avec d'autres Comités Techniques pour les
programmes et les activités. Le Comité Technique nommera une
personne pour assurer la liaison avec le Comité de Publications.

Articles pour Aerospace America. Les articles de chaque Comité
Technique relatifs aux activités de l'année par discipline seront
publiés le 7 septembre. Les conclusions de notre Comité Technique
porteront sur le développement de l'information technique dans le
monde aéronautique et spatial.

Statuts du Comité Technique. Une brève discussion a eu lieu au
sujet de l'élaboration de statuts du Comité Technique. Etant
donné que les résultats de l'enquête aideront à déterminer le
rôle du Comité Technique, il a été convenu d'élargir le domaine
d'application défini par Mme B. Lawrence dans sa lettre du 23
mars 1988.

Le comité technique de l'information technique (CTIT)
encourage le développement des services d'information
scientifique et technique dans le domaine aéronautique et
spatial. Le CTIT promeut également la circulation de
l'information technique dans toute la communauté aérospatiale en
organisant des activités qui favorisent les échanges d'idées et
en encourageant les évaluations critiques des procédés de
transfert de l'information.

Observations finales. Les deux articles inscrits à l'ordre du
jour et qui n'ont pas été discutés concernaient des
recommandations des membres supplémentaires de la commission, et
des fournisseurs d'information lors d'expositions. S'il y a lieu,
ces deux articles feront l'objet d'une discussion lors de la
prochaine réunion du Comité Technique.


TEXT N° 2

"Non  destructive  testing  techniques" 


I LANGUAGE PAIR :

French into English

II GLOSSARIES SELECTED :

- Aviation and Space

- Chemistry

- Mechanical engineering


III SUCCESSIVE OPERATIONS OF THE MACHINE TRANSLATION PROCESS :

- Optical reader                   --> Time :  6 mn
- Sending a text for translation   --> Time :  6 mn
- Post editing / Minimum           --> Time : 10 mn
               / Refined           --> Time : 20 mn
- Number of words translated       --> 838
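
The timings listed for Texts N°1 and N°2 can be set against the human
translation rate of 250 to 300 words per hour quoted in section 2. The
Python sketch below is an illustration only: it assumes that refined
post-editing replaces (rather than follows) minimum post-editing, as in
the per-page summary of section 3, and all names are chosen for the
example.

```python
# Timings in minutes and word counts, as listed under heading III of each text.
TEXTS = {
    "Text 1 (English into French)": {
        "words": 361, "optical_reader": 5, "sending": 4, "minimum": 5, "refined": 15,
    },
    "Text 2 (French into English)": {
        "words": 838, "optical_reader": 6, "sending": 6, "minimum": 10, "refined": 20,
    },
}
HUMAN_WORDS_PER_HOUR = (250, 300)  # average range quoted in section 2


def machine_minutes(text, level):
    """Total machine-assisted time: optical reading + sending + chosen post-editing."""
    return text["optical_reader"] + text["sending"] + text[level]


def human_minutes(words, words_per_hour):
    """Time a human translator would need at the given hourly rate."""
    return 60 * words / words_per_hour


if __name__ == "__main__":
    for name, text in TEXTS.items():
        slow, fast = (human_minutes(text["words"], r) for r in HUMAN_WORDS_PER_HOUR)
        print(f"{name}: refined {machine_minutes(text, 'refined')} min, "
              f"human {fast:.1f} to {slow:.1f} min")
```

Under these assumptions Text N°1 takes 24 min with refined post-editing
against roughly 72 to 87 min for a human translation, and Text N°2 takes
32 min against roughly 168 to 201 min.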

IV ANALYSIS OF THE ROUGH MACHINE TRANSLATION :


a/ Terminology

Source text                   Rough machine translation         Human translation

Assemblages par collage       Assemblies by joining             Bonded assemblies
(Faire un) pas                (To take a) pitch                 (To make a) step forward
Cuisson                       Cooking                           Curing
Atout                         Trump                             Asset
Système de recopie            Output system                     Printout system
Se tourner vers               To turn to                        To turn towards
Mettre en place               To install                        To set up
(Produit) réalisé             (Product) carried out             (Product) manufactured
Procéder à des mesures        To carry out measurements         To perform measurements
Décollements                  Separations                       Debondings
Liquide de couplage           Fluid of coupling                 Couplant fluid


b/ Defective analysis / prepositions

Source text                   Rough machine translation         Human translation

Un (éventail)                 One (range)                       A (range)
Des (problèmes)               Of the (problems)                 (problems)
Des meilleures techniques     Best techniques                   The best techniques
De nombreuses                 The many                          Many
Que ce soit...                Whether it is                     Either ... or
Aux (USA)                     To (the USA)                      In (the USA)
Expérience de                 Experience of                     Experience in
Un ensemble de                A whole of                        Many
Les besoins de (quelqu'un)    The needs for (sb)                The needs of (sb)

c/ Word order

Source text                   Rough machine translation         Human translation

Les industries automobile,    Car industries, aeronautical      The aerospace, car and
aéronautique et même          and even electronic               even electronics
électronique                                                    industries
Les méthodes les plus         The methods the most adapted      The most adapted methods
adaptées
Des différents partenaires    Different the partners            The different partners
Concurrence étrangère         Competition foreign               Foreign competition
Liés justement à              Connected precisely with          Precisely connected with
Ondes de cisaillement ou de   Waves of shearing or lamb         Shear or lamb waves
lamb
Analyse modale                Analyzes modal                    Modal analysis
Détectés rapidement           Detected quickly                  Quickly detected

d/ Unrecognized words

Source text                   Rough machine translation         Human translation

Multipartenaire               %multipartenaire                  Multipartner
Partenaire                    Partenaire                        Partner

V FINAL REMARKS


As far as French-speaking companies such as Aerospatiale are
concerned, the documents to be translated into English or any other
foreign language are generally intended to be dispatched abroad and
thus require refined post-editing of the rough machine translation.
In such cases, the intervention of a human translator is necessary,
but the process undoubtedly remains quicker and cheaper than an
entirely human translation.


SOURCE TEXT N°2


Les techniques de contrôle non destructif


Un très large éventail d'industries utilise les assemblages par
collage mais elles sont limitées dans leurs applications du fait
d'un manque de techniques de contrôle non destructif permettant
de détecter des défauts pouvant limiter la fiabilité et la durée
de vie de la structure. Ces défauts sont avant tout des problèmes
d'adhésion entre la colle et la pièce et des problèmes de
cohésion (qualité de la colle) qu'il faut détecter et évaluer
d'une manière non destructive. Le but de ce projet est de
rassembler des compétences européennes afin de faire un pas
décisif dans le contrôle de ces structures (et des interfaces en
général) en mettant au point de nouvelles techniques ultrasonores
permettant d'améliorer la rapidité et les capacités de détection.
Ces recherches forment un programme complet des meilleures
techniques que l'on peut envisager dans ce domaine.

L'interconnexion entre ces différents laboratoires permettra de
comparer et de compléter les techniques qui seront alors évaluées
sur un .round .robin .test. Les répercussions sont importantes
dans les industries automobile, aéronautique et même
électronique.

Si de nombreuses industries sont tentées d'utiliser les
assemblages par collage, elles se heurtent le plus souvent au
problème de l'assurance de la qualité du produit final. Cette
qualité se base sur la maîtrise des procédés de fabrication mais
aussi sur un contrôle non destructif capable de mettre en
évidence les défauts pouvant limiter la durée de vie de la
structure. Ces défauts de type collectif (porosité, mauvaise
cuisson) ou de type adhésif (absence de contact ou contact sans
adhésion) peuvent se produire en cours de fabrication et se
dégrader en cours de service. Les méthodes les plus adaptées pour
la détection de ces défauts sont avant tout des techniques
ultrasonores. Tous les partenaires de ce projet ont déjà une
sérieuse expérience dans ces techniques, que ce soit du point de
vue de la recherche, du développement ou de l'utilisation
d'appareils déjà commercialisés par les partenaires 1 et 3.

Le regroupement des compétences européennes permettra de faire
une évaluation comparative de nouvelles techniques novatrices en
se basant sur les réflexions des différents partenaires ainsi que
des études menées actuellement aux Etats-Unis.

Tous les partenaires de ce projet ont déjà une sérieuse
expérience technique dans le sujet et nous pouvons assurer que ce
programme permettra d'obtenir des résultats tout à fait
satisfaisants. De plus, les Partenaires 1, 2 et 3 ont également
une grande expérience de programmes multipartenaires ce qui est
un atout supplémentaire pour le succès de ce projet.

- Réduction des temps de contrôle

- Faire face à la concurrence étrangère en proposant des produits
plus fiables et mieux conçus, que ce soit dans le domaine
aéronautique, automobile ou électronique.

- Augmentation de la fiabilité par l'emploi d'un système de
recopie (automatique ou semiautomatique) connecté avec un
système expert limitant l'interprétation humaine.

- Meilleure connaissance des comportements des colles et des
joints collés conduisant à une utilisation plus rationnelle de
ce procédé d'assemblage.

- Gain de poids du fait de la possibilité d'utiliser des renforts
locaux là où les contraintes sont importantes.

6.1 Présentation générale

L'industrie moderne se tourne de plus en plus vers l'utilisation
de matériaux composés de couches ou de protections successives et
qui nécessitent de mettre en place des méthodes non destructives
de contrôle afin d'assurer la qualité des interfaces et du
produit fini. Ceci est particulièrement vrai pour les assemblages
par collage qui présentent de nombreux avantages (réduction de
poids, meilleure répartition des contraintes) et qui permettent
des conceptions de structure ou des positionnements impossibles
à réaliser par soudage ou rivetage.

Cependant l'utilisation intensive de ces assemblages se heurte à
des problèmes liés justement à l'assurance de la qualité du
produit réalisé.

6.2 Contenu scientifique du projet

Pour atteindre les objectifs décrits dans le paragraphe 2 et pour
résoudre les problèmes exposés ci-dessus, nous mettrons au point
un ensemble de nouvelles techniques ultrasonores.

Des aspects fondamentaux seront abordés en étudiant les
propriétés des câbles et des assemblages dans le but de répondre
aux besoins des bureaux d'études pour le dimensionnement et la
compréhension des phénomènes de dégradation à l'échelle
micrométrique (Partenaires 1 et 4).

Des études seront menées pour la détection des défauts d'adhésion
en utilisant des ondes de cisaillement ou de .lamb sur une large
gamme de fréquences (de 1 à 100 Hz), (Partenaires 3 et 5).

Des essais en vibration seront effectués soit pour procéder à des
mesures plus globales sur une structure (analyse modale,
Partenaire 2) ou encore pour créer des décollements sur les zones
de mauvaises adhésions pour un .proof .test ultrasonore
(Partenaire 6). Ces défauts pourront alors être détectés
rapidement avec des capteurs rotatifs sans liquide de
couplage (Partenaire 1 avec support théorique du Partenaire 7).

Toutes ces études seront menées dans le but de constituer un
système expert à partir de l'extraction des informations
représentatives des défauts recherchés (Partenaire 2 en
collaboration avec les autres Partenaires).


ROUGH MACHINE TRANSLATION N° 2


MOTS  TRAITES  :  838  COMPTE  AVANT  :  252161-  COMPTE  APRES  : 

252999- 

0100100000P5123  FE  YPA  TG=495DEBUG=S  SYS=UCDATE=18  10  89 

14H32  0005645 


Techniques  of  non  destructive  testing 


One  very  broad  range  of  industries  uses  the  assemblies  by  joining 
but  they  are  limited  in  their  applications  because  of  a  lack  of 
techniques  of  non  destructive  testing  making  it  possible  to 
detect  defects  being  able  to  limit  the  reliability  and  the 
lifespan  of  the  structure.  These  defects  are  above  all  problems 
of  adhesion  between  the  adhesive  and  the  part  and  of  the  problems 
of  cohesion  (quality  of  the  adhesive)  which  it  is  necessary  to 
detect  and  evaluate  in  a  non  destructive  way.  The  aim  of  this 
project  is  to  gather  European  competences  in  order  to  take  a 
decisive  pitch  in  the  control  of  these  structures  (and  of  the 
interfaces  in  general)  by  developing  the  new  ultrasonic 
techniques  making  it  possible  to  improve  the  speed  and  the 
capacities  of  detection.  This  research  forms  a  complete  program 
of  best  techniques  than  one  can  consider  in  this  field.  The 
interconnection  between  these  various  laboratories  will  make  it 
possible  to  compare  and  supplement  techniques  which  will  then  be 
evaluated  on  a  round  robin  test.  The  repercussions  are 
significant  in  the  car  industries,  aeronautical  and  even 
electronic.

If  the  many  industries  are  tempted  to  use  the  assemblies  by 
joining,  they  generally  run  up  against  the  problem  of  the  quality 
assurance  of  the  end  product.  This  quality  is  based  on  the 
control  of  the  methods  of  manufacture  but  also  on  a  non 
destructive  testing  able  to  highlight  the  defects  being  able  to 
limit  the  lifespan  of  the  structure.  These  defects  of  the 
collective  type  (porosity,  the  bad  cooking)  or  of  adhesive  type 
(absence  of  contact  or  contact  without  adhesion)  can  occur  in  the 
course  of  manufacture  and  be  degraded  in  the  course  of  service. 
The  methods  the  most  adapted  for  detection  of  these  defects  are 
above  all  ultrasonic  techniques.  All  the  partners  of  this  project 
have  already  serious  experience  in  these  techniques,  whether  it 
is  from  the  point  of  view  of  research,  development  or  use  of 
aircraft  already  marketed  by  partners  1  and  3. 

The  regrouping  of  European  competences  will  make  it  possible  to 
make  a  comparative  evaluation  of  the  new  innovative  techniques 
while  being  based  on  the  reflexions  of  different  the  partners  as 
well  as  studies  currently  undertaken  to  the  United  States. 

All  the  partners  of  this  project  have  already  serious  technical 
experience  in  the  subject  and  we  can  ensure  that  this  program 
will  make  it  possible  to  obtain  completely  satisfactory  results. 
Moreover,  Partners  1,  2  and  3  also  have  great  experience  of 
programs  %multipartenaires  what  is  an  additional  trump  for  the 
success  of  this  project. 

-  Reduction  of  times  of  control 
- Deal with the competition foreign by proposing more reliable
and  better  designed  products,  either  in  the  aeronautical  field, 
automobile  or  electronic. 

-  Increase  in  reliability  by  the  use  of  an  output  system 
(automatics  or  semiautomatic)  connected  with  an  expert  system 
limiting  human  interpretation. 

-  Better  knowledge  of  the  behaviors  of  the  adhesives  and  the 
adhesive  bonded  joints  leading  to  a  more  rational  use  of  this 
method  of  assembly. 

-  Gain  of  weight  because  of  the  possibility  of  using  local 
reinforcements  where  the  stresses  are  significant. 

6.1  General  Presentation 

Modern  industry  turns  more  and  more  to  the  use  of  materials  made 
up  of  layers  or  successive  protections  and  which  require  to 
install  non  destructive  methods  of  control  in  order  to  ensure  the 
quality  of  the  interfaces  and  finished  product.  This  is 
particularly  true  for  the  assemblies  by  joining  which  have  many 
advantages  (reduction  of  weight,  better  distribution  of  the 
stresses)  and  which  allow  designs  of  structures  or  positionings 
impossible  to  realize  by  welding  or  riveting. 

However  the  intensive  use  of  these  assemblies  encounters  problems 
connected  precisely  with  the  quality  assurance  of  the  product 
carried  out. 

6.2  Scientific  Contents  of  the  project 

To  achieve  the  goals  described  in  paragraph  2  and  to  solve  the 
problems  mentioned  above,  we  will  develop  a  whole  of  the  new 
ultrasonic  techniques. 

Fundamental  aspects  will  be  approached  by  studying  the  properties 
of  the  cables  and  the  assemblies  with  the  aim  of  meeting  the 
needs for the design offices for the dimensioning and the
comprehension of the phenomena of degradation on a micrometric
scale (Partners 1 and 4).

Studies  will  be  undertaken  for  the  detection  of  the  defects  of 
adhesion  by  using  waves  of  shearing  or  lamb  on  a  broad  frequency 
range (from 1 to 100 Hz), (Partners 3 and 5).

Tests  in  vibration  will  be  carried  out  either  to  carry  out  more 
total  measurements  on  a  structure  (analyzes  modal,  Partenaire  2) 
or  to  create  separations  on  the  areas  of  the  bad  adhesions  for  an 
ultrasonic proof test (Partner 6). These defects could then be
detected quickly with rotary sensors without fluid of coupling
(Partner 1 with theoretical support of Partner 7).

All  these  studies  will  be  carried  out  with  the  aim  of 
constituting  an  expert  system  starting  from  the  extraction  of 
representative  information  of  the  required  defects  (Partner  2  in 
collaboration with the other Partners).
POST-EDITING N°2


NON  DESTRUCTIVE  TESTING  TECHNIQUES 


A  wide  range  of  industries  uses  bonded  assemblies  but  their 
applications  are  limited  because  of  a  lack  of  non  destructive 
testing  techniques  for  detecting  defects  which  could  limit  the 
reliability  and  the  life  of  the  structure.  These  defects  are 
above  all  problems  of  adhesion  between  the  adhesive  and  the  part 
and problems of cohesion (quality of the adhesive) which must be
detected  and  evaluated  by  a  non  destructive  method.  The  aim  of 
this  project  is  to  gather  European  skills  in  order  to  make  a 
decisive  step  forward  in  the  control  of  these  structures  (and  of 
the  interfaces  in  general)  by  developing  new  ultrasonic 
techniques  which  permit  an  improvement  in  speed  and  capacities 
of  detection.  These  studies  form  a  comprehensive  programme  of 
the  best  techniques  that  can  be  considered  in  this  field.  The 
interconnection  between  these  various  laboratories  will  make  it 
possible  to  compare  and  supplement  techniques  which  will  then  be 
evaluated  in  a  round  robin  test.  The  repercussions  are 
significant  in  the  aerospace,  car  and  even  electronics  industry. 


If  a  large  number  of  industries  are  tempted  to  use  the  bonded 
assemblies,  they  generally  come  up  against  the  problem  of  the 
quality  assurance  of  the  end  product.  This  quality  is  based  on 
the  control  of  the  production  methods  but  also  on  a  non 
destructive  testing  able  to  detect  defects  which  could  limit  the 
structure  life. 

These  defects  of  the  cohesive  type  (porosity,  faulty  curing)  or 
adhesive  type  (lack  of  contact  or  contact  without  adhesion)  can 
occur  during  production  and  worsen  whilst  in  service.  The  most 
adapted  method  for  detection  of  these  defects  are  above  all 
ultrasonic  techniques.  All  the  partners  in  this  project  have 
already  a  serious  experience  on  these  techniques,  either 
considering  research  development  or  use  of  devices  already 
marketed  by  partner  1  and  3. 

Gathering  European  skills  will  permit  a  comparative  evaluation  of 
the  new  innovative  techniques,  using  the  reflections  of  the 
different  partners  as  well  as  studies  currently  carried  out  in 
the  USA. 


All  the  partners  in  this  project  have  already  a  serious  technical 
experience  on  the  subject  and  we  can  ensure  that  this  programme 
will  permit  to  obtain  completely  satisfactory  results.  Moreover, 
Partners  1,  2,  3  have  a  great  experience  in  multipartner 
programmes, which is an additional asset for the success of this
project. 


-  Reduction  of  control  times  (factor  5) 

-  To  deal  with  the  foreign  competition  by  proposing  more 
reliable,  better  conceived  products  in  the  aerospace,  car  or 
electronics  industry. 
- Increase in reliability by the use of a printout system
(automatic  or  semiautomatic)  connected  to  an  expert  system 
limiting  human  interpretation. 

-  Better  knowledge  of  the  behaviours  of  adhesives  and  adhesive 
bonded  joints,  leading  to  a  more  rational  use  of  this  joining 
method. 

-  Weight  saving  because  of  the  possibility  to  use  local 
reinforcements  where  stresses  are  high. 


6.1  General  presentation 


The  modern  industry  turns  more  and  more  towards  the  use  of 
materials  made  of  layers  or  successive  protections,  which 
require  to  set  up  non  destructive  control  methods  in  order  to 
ensure  quality  of  the  interfaces  and  of  the  finished  product. 

This  is  particularly  true  for  bonded  assemblies  which  have  many 
advantages  (weight  saving,  better  stress  distribution)  and  which 
allow  structure  designs  or  positionings  impossible  to  achieve  by 
welding  or  riveting. 

However  the  intensive  use  of  these  assemblies  encounters  problems 
precisely  connected  with  the  quality  assurance  of  the  product 
manufactured. 


6.2  Scientific  contents  of  the  project 


To achieve the objectives described in paragraph 2 and to solve
the problems mentioned above, we will develop new ultrasonic
techniques. 

Fundamental  aspects  will  be  approached  while  studying  the 
properties  of  the  adhesives  and  of  the  assemblies  aiming  at 
meeting  the  needs  of  the  design  offices  for  the  dimensioning  and 
the  understanding  of  the  phenomena  of  degradation  on  a 
micrometric scale (Partner 1 and 4).

Studies  will  be  carried  out  for  detecting  adhesion  defects  by 
using shear or Lamb waves on a wide frequency range (from 1 to
100  Hz),  (Partner  3  and  5). 

Vibration  tests  will  be  carried  out  either  to  perform  more  global 
measurements  on  a  structure  (modal  analysis  Partner  2)  or  to 
create  debondings  on  the  areas  of  faulty  bonding  adhesions  for 
a  proof  ultrasonic  test  (Partner  6).  These  defects  could  then 
be  quickly  detected  with  wheel  sensors  without  couplant 
fluid  (Partner  1  with  theoretical  support  of  Partner  7). 

All  the  studies  carried  out  will  aim  at  constituting  an  expert 
system  from  the  extraction  of  data  representative  of  the  defects 
investigated  (Partner  2,  in  connection  with  other  Partners). 
TEXT N° 3


Benefits of Computer Assisted Translation
for  the  Heads  of  Information  Centers 
(Background  paper) 


I  LANGUAGE  PAIR  : 

French  into  English 

II  GLOSSARIES  SELECTED  : 

-  Computers  /  Data  processing 

-  Political  science 

-  Aviation  and  Space 


III SUCCESSIVE OPERATIONS OF THE MACHINE TRANSLATION PROCESS :


- Optical reader                      —> Time 2 mn

- Sending a text for translation      —> Time 2 mn

- Post editing / Minimum              —> Time 5 mn

               / Refined              —> Time 10 mn

- Number of words translated          —> 379
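The figures in section III can be turned into an overall throughput. The sketch below is only an illustration: it reads the listing as 2 mn for optical reading, 2 mn for sending the text, 5 mn for minimum post-editing and 10 mn for refined post-editing, and it assumes (this is not stated in the text) that the steps run one after another and that refined post-editing replaces minimum post-editing rather than following it. All names are hypothetical.

```python
# Timings (minutes) and word count as read from section III of TEXT N° 3.
OCR_MIN = 2                                     # optical reader
SEND_MIN = 2                                    # sending the text for translation
POST_EDIT_MIN = {"minimum": 5, "refined": 10}   # assumed to be alternatives
WORDS = 379                                     # number of words translated


def words_per_minute(post_edit_level: str) -> float:
    """Overall throughput, assuming the three steps run sequentially."""
    total_minutes = OCR_MIN + SEND_MIN + POST_EDIT_MIN[post_edit_level]
    return WORDS / total_minutes


for level in POST_EDIT_MIN:
    print(f"{level}: {words_per_minute(level):.1f} words/min")
```

Under these assumptions the whole chain handles roughly 42 words per minute with minimum post-editing and roughly 27 with refined post-editing.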

IV  ANALYSIS  OF  THE  ROUGH  MACHINE  TRANSLATION  : 


a/  Terminology 


Source text                                Rough machine translation                          Human translation

Intérêt (de qch pour qqn)                  Interest of sth for sb                             Benefits of sth for sb

Responsables (de centres d'information)    Persons responsible (for centers of information)   Heads of information centers

Applications                               Implementations                                    Applications

Société                                    Society                                            Company

Disposition                                Provision                                          Layout

Présenter                                  To forward                                         To present
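The terminology corrections listed above lend themselves to a simple glossary check during post-editing. The following is a minimal sketch, not a description of the system used for these tests: the dictionary holds three pairs taken from the table, and the function name and sample sentence are hypothetical. It only flags candidate terms, since a word such as "society" may be correct in another context.

```python
import re

# Rough-MT rendering -> rendering preferred by the human translator
# (pairs taken from the terminology table; the list is illustrative).
GLOSSARY = {
    "implementations": "applications",
    "society": "company",
    "provision": "layout",
}


def flag_terms(rough: str) -> list[tuple[str, str]]:
    """Return (term found, suggested term) pairs, in glossary order."""
    hits = []
    for found, suggested in GLOSSARY.items():
        # Whole-word, case-insensitive match so "Society" is caught too.
        if re.search(rf"\b{re.escape(found)}\b", rough, flags=re.IGNORECASE):
            hits.append((found, suggested))
    return hits


sample = "texts intended to be diffused outside a society and internal implementations"
for found, suggested in flag_terms(sample):
    print(f"{found!r} -> consider {suggested!r}")
```

Flagging rather than replacing automatically leaves the final choice to the post-editor, which matches the role post-editing plays in the examples of this section.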
b/ Grammar


Source text                                Rough machine translation                          Human translation

Textes à traduire                          Texts for translation                              Texts to be translated

c/  Defective  analysis  /  prepositions 


Source text                                Rough machine translation                          Human translation

Besoins de qqn                             Needs for sb                                       Needs of sb

Attacher de l'importance à ... ni à ...    To attach importance to ... nor with ...           To attach importance to ... nor to ...

Qualité de qch                             Quality for sth                                    Quality of sth

Extérieures à                              External at                                        External to

Apporter qch à ... et à ...                To bring sth to ... and with                       To bring sth to ... and to ...

Des exemples ... seront ...                Of the examples will be ...                        Examples will be ...

En développement                           In development                                     Under development
d/ Word order


Source text                                Rough machine translation                          Human translation

Traduction assistée par ordinateur         Translation Computer-assisted                      Computer Assisted Translation

N'ont en général pas besoin de...          Do not need in general to                          Generally do not need to

Sera normalement suffisante                Will be normally sufficient                        Will normally be sufficient

Problèmes techniques et humains            Problems technical and human                       Technical and human problems

V  FINAL  REMARKS 

Summaries of voluminous documents or conference proceedings may
provide a very useful first approach to a new text. A machine
translation system is thus a high-performance tool that enables
librarians, for example, to grasp quickly, in their own language,
the broad content of a document, which makes it much easier for
them to file and classify a large number of texts.
SOURCE TEXT N°3


Intérêt que peut présenter
la Traduction Assistée par Ordinateur pour
les responsables de centres d'information
(Document de travail provisoire)


Résumé.

Dans cet exposé, les deux applications de Traduction Assistée par
Ordinateur seront abordées : la Traduction Assistée par
Ordinateur qui a pour but de produire des textes destinés à être
diffusés à l'extérieur d'une société et la Traduction Assistée
par Ordinateur qui vise à rassembler des informations pour des
applications internes.

La dernière application exige des lexiques considérables,
couvrant un large éventail de textes et de domaines techniques
mais elle n'accorde pas d'importance à la disposition ni à la
présentation des informations. Dans ce cas, les textes traduits
n'ont en général pas besoin d'être corrigés car la traduction
brute sera normalement suffisante pour que les utilisateurs
finaux aient une idée approximative du contenu des textes. La
vitesse, cependant, est importante.

Si les systèmes de Traduction Assistée par Ordinateur sont
utilisés pour produire des publications destinées à des personnes
extérieures à la société, d'autres critères doivent être pris en
considération, c'est-à-dire :

- Facilité d'importation des textes à partir de divers serveurs,

- Conservation de la présentation et de la disposition des
informations,

- Possibilité pour l'utilisateur final de mettre à jour des
lexiques et d'influencer la traduction,

- Très bonne qualité de la traduction,

- Outils à disposition pour la correction de la traduction brute,

- Vitesse de traduction,

- Capacité du système à intégrer des développements ultérieurs.


La conférence intitulée "l'intérêt de la Traduction Assistée par
Ordinateur pour les responsables de centres d'information et pour
les utilisateurs finaux" a pour but de montrer l'intérêt que la
Traduction Assistée par Ordinateur peut apporter non seulement au
responsable d'un centre d'information, mais également à
l'utilisateur final. Après avoir défini les systèmes existants,
la nature des textes à traduire, les problèmes techniques et
humains liés à l'utilisation des systèmes et les besoins des
utilisateurs finaux (qualité des traductions, connaissance de
l'information dans la langue maternelle...), des exemples
d'applications en cours ou en développement seront présentés. Ces
diverses applications permettront de mettre en évidence les
avantages de ces systèmes pour les centres d'information et de
proposer des solutions au bénéfice de l'utilisateur final.
ROUGH MACHINE TRANSLATION N°3


MOTS TRAITES : 379  COMPTE AVANT : 256101 - COMPTE APRES : 256480-

0100100000P502 FE YPA TG=3P4DEBUG=S SYS=UCDATE=19 10 89 09H41 0002668


Interest  of  the  Translation  Computer-assisted  for  the  persons 
responsible for centers of information (provisional working
document)

Summary. 

In  this  statement,  the  two  following  implementations  of 
Translation  Computer-assisted  will  be  approached  :  The 
Translation  Computer-assisted  which  is  intended  to  produce  texts 
intended  to  be  diffused  outside  a  society  and  the  Translation 
Computer-assisted  which  aims  at  gathering  information  for 
internal implementations.

The  last  implementation  requires  considerable  dictionaries, 
covering  a  broad  range  of  texts  and  technical  fields  but  it  does 
not  attach  importance  to  the  provision  nor  with  the  presentation 
of  information.  In  this  case,  the  translated  texts  do  not  need  in 
general  to  be  corrected  for  the  raw  translation  will  be  normally 
sufficient  so  that  the  end-users  have  an  approximate  idea  of  the 
contents of the texts. Speed, however, is significant.

If  the  translation  systems  Computer-assisted  are  used  to  produce 
publications  intended  to  persons  external  at  society,  other 
criteria  must  be  taken  into  account,  i.e.  : 

-  Facility  of  importation  of  the  texts  starting  from  various 
hosts, 

-  Conservation  of  the  presentation  and  the  provision  of 
information, 

-  Possibility  for  the  end-user  of  updating  dictionaries  and  of 
influencing  the  translation, 

-  The  Very  good  quality  of  the  translation, 

-  Tools  at  diposal  for  the  correction  of  the  raw  translation, 

- Speed of translation,

-  Capacity  of  the  system  to  integrate  later  developments. 

The  conference  entitled  "  the  interest  of  the  Translation 
Computer-assisted for the persons responsible for centers of
information  and  for  the  end-users  "  is  intended  to  show  the 
interest  that  the  Translation  Computer-assisted  can  bring  not 
only  to  the  person  responsible  for  a  center  of  information  but 
also  with  the  end-user.  After  having  defined  the  existing 
systems,  the  nature  of  the  texts  for  translation,  problems 
technical  and  human  connected  with  the  use  of  the  systems  and  the 
needs  for  the  end-users  (quality  for  the  translations,  knowledge 
of information in the mother tongue...), of the examples of
implementations in progress or in development will be forwarded.
These various implementations will allow to highlight the
advantages of these systems for the centers of information and to
propose solutions for the benefit of the end-user.
POST-EDITING N°3


Benefits  of  Computer  Assisted  Translation 
for  the  Heads  of  Information  Centers 
(Background  paper) 


Abstract: 

In  this  paper,  the  two  following  applications  of  Computer 
Assisted  Translation  will  be  dealt  with  :  the  Computer  Assisted 
Translation  intended  to  produce  texts  to  be  dispatched  outside  a 
company  and  the  Computer  Assisted  Translation  which  aims  at 
gathering  information  for  internal  applications. 

The  latter  application  requires  extensive  lexicons,  covering  a 
wide  range  of  texts  and  technical  fields  but  need  not  be 
concerned  with  the  layout  nor  with  the  presentation  of 
information.  In  this  case,  the  translated  texts  generally  do  not 
need  to  be  corrected,  for  the  rough  translation  will  normally  be 
sufficient  for  the  end  users  to  have  an  approximate  idea  of  the 
content  of  the  texts.  Speed,  however,  is  significant. 

If Computer Assisted Translation systems are used to produce
publications intended for third parties, other criteria must be
taken into account, i.e. :

-  Ease  of  text  import  from  various  host  systems, 

-  Preservation  of  the  presentation  and  the  layout  of  information, 

- Possibility for the end-user of updating lexicons and of
influencing  the  translation, 

-  A  very  high  translation  quality, 

-  Tools  at  disposal  for  the  correction  of  the  rough  translation, 

-  Speed  of  translation, 

-  Capacity  of  the  system  to  incorporate  future  developments. 

The conference entitled "the benefits of the Computer Assisted
Translation for the heads of information centers and for the
end-users" is intended to show the interest that Computer Assisted
Translation  can  bring,  not  only  to  the  head  of  an  information 
center,  but  also  to  the  end-user.  After  having  defined  the 
existing  systems,  the  nature  of  the  texts  to  be  translated,  the 
technical  and  human  problems  connected  with  the  use  of  the 
systems  and  the  needs  of  end-users  (quality  of  the  translations, 
information knowledge in the mother tongue...), examples of
on-going applications and systems under development will be
presented.  These  various  applications  will  make  it  possible  to 
highlight  the  advantages  of  these  systems  for  information  centers 
and  to  propose  solutions  for  the  benefit  of  the  end-user. 
INFO/GO
09 04
Transfert information*; Barrière; Traduction machine; Recherche défense
Echange information*; Echange international*; Barrière linguistique; Recherche aérospatiale


C-88-009085
Traduction avec accès mémoire direct.
Direct memory access translation.
10th International Joint Conference on Artificial Intelligence (IJCAI 87).
Milan, IT
1987/08/23-1987/08/28
TOMABECHI H.
Carnegie Mellon Univ., Pittsburgh, US
Mémoire Congrès
ENG
US
Morgan Kaufmann (Los Altos)
VOL 2/2; pp. 722-725; 16 Ref.; 1 Fig.; OP. 1987
0-934-61343-5
05; M 5234
Présentation d'une théorie dans laquelle la traduction est
considérée comme partie intégrante du traitement cognitif. Dans ce
paradigme, la compréhension en langage source est une
reconnaissance des entrées, en termes de connaissances existant en
mémoire, suivie d'une intégration de ces entrées dans la
mémoire. La traduction étant effectuée avec accès direct au réseau
de mémoire, d'autres processus cognitifs (inférence, par exemple)
peuvent participer dynamiquement à cette traduction (création de
nouveaux concepts; apprentissage d'un nouveau vocabulaire).
INFO/CR
05 07; 06 04
Traduction machine*; Machine apprentissage*
Accès mémoire; Théorie de la connaissance; Analyse lexicale;
Traitement parallèle; Règle inférence; Partage mémoire; Langage
naturel


C-88-009084
Architecture d'analyseur syntaxique universel pour **traduction**
par **machine** intelligente.
The universal parser architecture for knowledge-based machine
translation.
10th International Joint Conference on Artificial Intelligence (IJCAI 87).
Milan, IT
1987/08/23-1987/08/28
TOMITA M.; CARBONELL J. G.
Carnegie Mellon Univ., Pittsburgh, US; Carnegie Mellon Univ., Pittsburgh, US
Mémoire Congrès
ENG
US
Morgan Kaufmann (Los Altos)
VOL 2/2; pp. 718-721; 17 Ref.; 2 Fig.; OP. 1987
0-934-61343-5
05; M 5234
Une **traduction** par **machine** doit être sémantiquement
précise, linguistiquement correcte, interactive et extensible à
plusieurs langues et domaines. Une architecture d'analyseur
universel s'efforce d'atteindre l'ensemble de ces objectifs. Des
bases de connaissances linguistiques (syntaxe, sémantique,
lexique, pragmatique) codées sous des formes appropriées
(grammaire) sont modifiées et précompilées en vue de l'analyse
syntaxique et de la génération des textes. Les premiers résultats
de traductions bidirectionnelles anglais et japonais sont
encourageants et démontrent la faisabilité théorique de l'approche.
INFO/CR
05 07; 09 02
Traduction machine*; Architecture calculateur*; Sémantique;
Linguistique automatisée; Grammaire
Analyse syntaxique*; Base de connaissance


C-88-008796
Analyse syntaxique structurale modifiée pour systèmes de
compréhension de langage.
Modified caseframe parsing for speech understanding systems.
10th International Joint Conference on Artificial Intelligence (IJCAI 87).
Milan, IT
1987/08/23-1987/08/28
POESIO M.; RULLENT C.
CSELT, Torino, IT; CSELT, Torino, IT
Mémoire Congrès
ENG
IT
Morgan Kaufmann (Los Altos)
VOL 2/2; pp. 622-625; 12 Ref.; 6 Fig.; OP. 1987
0-934-61343-5
05; M 5234
Proposition d'une stratégie d'analyse de formes syntaxiques, pour
systèmes de compréhension de parole, différant des stratégies
classiques par au moins deux aspects : l'analyse ne repose pas
uniquement sur un processus descendant et les formes syntaxiques
sont amalgamées à une connaissance syntaxique avant d'être
utilisées. Cette stratégie peut être exécutée comme un processus
d'inférence.
INFO/CR
17 02; 05 07
Reconnaissance parole*; Sémantique*; Linguistique automatisée;
Traduction machine
Analyse syntaxique*; Esprit programme; Forme syntactique; Règle
inférence; Langage naturel


C-88-008430
Construction d'interfaces en langage naturel pour systèmes
experts basés règles.
Building natural language interfaces for rule-based expert systems.
10th International Joint Conference on Artificial Intelligence (IJCAI 87).
Milan, IT
1987/08/23-1987/08/28
MOERDLER G. D.; MCKEOWN K. R.; ENSOR J. R.
Columbia Univ., New York, US; Columbia Univ., New York, US; AT and T
Bell Labs., Holmdel, US
Mémoire Congrès
ENG
US
Morgan Kaufmann (Los Altos)
VOL 2/2; pp. 682-687; 25 Ref.; 5 Fig.; OP. 1987
0-934-61343-5
05; M 5234
Etude d'une sémantique pour traduction de phrases en langage
naturel, en informations factuelles pour un système expert
sous-jacent, en remplacement de l'interface à menu plus
conventionnelle utilisée pour collecter les données fournies par
l'utilisateur. Description de deux problèmes rencontrés dans la
construction de ce nouveau type d'interfaces pour systèmes experts
: le traitement sémantique des phrases de l'utilisateur et la
conception d'un interprète, pour le système expert, utilisant
efficacement les données factuelles fournies par l'utilisateur.
INFO/CR
09 02; 05 07
Système expert*; Traduction machine*; Relation homme machine;
Sémantique; Acquisition donnée; Interprétation
Langage naturel*; Base donnée factuelle; Programme traducteur; Moteur
d'inférence; Base de connaissance


C-88-008328
Compréhension de spécifications de systèmes, écrites en langage
naturel.
Understanding system specifications written in natural language.
10th International Joint Conference on Artificial Intelligence (IJCAI 87).
Milan, IT
1987/08/23-1987/08/28
GRANACKI J. J.; PARKER A. C.; ARENS Y.
Univ. of Southern California, Los Angeles, US; Univ. of Southern
California, Los Angeles, US; Univ. of Southern California, Los
Angeles, US
Mémoire Congrès
ENG
US
Morgan Kaufmann (Los Altos)
VOL 2/2; pp. 688-691; 12 Ref.; 2 Fig.; 1 Tabl.; OP. 1987
0-934-61343-5
05; M 5234
Description de recherches sur la compréhension de spécifications
systémiques écrites en langage naturel. Ces recherches comportent
la mise en oeuvre de l'interface PHRAN-SPAN ('Phrasal
Analyser-Specification Analysis'), pour la spécification du
comportement abstrait de systèmes numériques dans un texte anglais
limité, avec le système ADAM ('Advanced Design AutoMation') de
l'University of Southern California.
INFO/CR
05 07
Sémantique*; Conception assistée par calculateur; Traduction machine;
Spécification
Langage naturel*; Analyse syntaxique; Etude conception système;
Structure donnée


C-88-008327
Représentation et interprétation de déterminants dans un langage
naturel.
Representation and interpretation of determiners in natural
language.
10th International Joint Conference on Artificial Intelligence (IJCAI 87).
Milan, IT
1987/08/23-1987/08/28
DI EUGENIO B.; LESMO L.
Univ., Torino, IT; Univ., Torino, IT
Mémoire Congrès
ENG
IT
Morgan Kaufmann (Los Altos)
VOL 2/2; pp. 648-654; 22 Ref.; 3 Fig.; OP. 1987
0-934-61343-5
05; M 5234
Proposition d'un formalisme de représentation sémantique
susceptible de traiter les problèmes de référence c'est à dire
d'interpréter des séquences de mots (notamment celles débutant par
un article). Ce formalisme suit une approche de réseau sémantique
et utilise différents plans de représentation (sémantique,
contenu, référence) ainsi que certaines structures particulières,
dénommées espaces d'ambiguïté, qui dissimulent les ambiguïtés et
restent neutres vis à vis des diverses interprétations possibles
jusqu'à ce que les ambiguïtés soient levées.
INFO/CR
05 07
Sémantique*; Interprétation; Linguistique automatisée; Phrase
grammaire; Traduction machine
Langage naturel*; Base de connaissance; Fonction ambiguïté; Grammaire
syntactique


C-88-005301
Colloque sur les techniques d'évaluation pour conception de
systèmes interactifs (2e).
Colloquium on evaluation techniques for interactive system design
: II.
London, GB
1987/10/02
IEE Comput. and Ctrl. Division (GB)
Congrès
ENG
GB
IEE Colloquium Digests (GB)
IEE, London
NO 1987/78; 26 p.; 2 Ref.; 1 Fig.; 3 Tabl.; 7 résumés; OP. 1987
XEECD8
05; Me 131-4
Problèmes pratiques rencontrés dans l'analyse de données
d'interactions temps réel homme-machine. Utilisation d'équipements
vidéo dans le processus de conception. Analyse de questionnaires
sur la satisfaction des utilisateurs de l'informatique. Application
de techniques statistiques au processus de conception en vue de
minimiser l'évaluation expérimentale. Etude des avantages d'une
planification en profondeur rigoureuse dans le domaine de la
recherche appliquée. Description d'une méthodologie de conception
de systèmes de traitement de données. Présentation d'un outil de
conception pour interfaces graphiques interactives.
INFO/CR
05 08; 09 02
Relation homme machine*; Affichage graphique interactif*; Interface;
Recherche appliquée; Planification projet; Assurance qualité;
Traduction machine; Psychométrie; Conception assistée par
calculateur; Evaluation performance
Système interactif*; Etude conception système


C-88-F01I68 

Tecnnologle  nordlQue  09  potnu 
Foncs  ce  1 'Industrie  Nordlque  (NO) 

Ouvrage 

FRE 

NO 

Fonos  ct  1 'Industrie  Nordlque,  Oslo 
39  p«:  Qoelq.  Fig.;  nomor.  Phot.;  OP.  1967 
05;  M  260/824  P 

Satellite telecommunications. Coordination of VLSI circuit
manufacturing. Systematic reuse of program modules. Computer-assisted
translation. Speech synthesis. Image processing. Automatic transfer
to tape. Study of enzymes. Biotechnology and proteins. Maritime
navigation aid systems based on interactive computerised management.
Mechanical properties of alloys subjected to rapid solidification
and extrusion. New fields of application for epoxy/carbon-fibre
composites.
INFO/AN 
J4  06 

Research and development*; Denmark; Finland; Iceland; Norway; Sweden;
Machine translation; Image processing; Programming language;
Information transfer; Maritime navigation; Satellite
telecommunication.

Scientific cooperation*; Technological development*; EUREKA
project; Very large scale integration; Nordic countries


C-88-F00069 

Intelligence artificielle et systèmes experts.

Convention informatique. L'informatique : du discours à la méthode.
Paris  (FR) 

1986/09/15-1986/09/19 

Direction des Industries Électroniques et de l'Informatique (FR)
Mémoire Congrès
FRE 
FR 

DIELI, Paris

VOL A; pp. 45-60; nombr. Fig.; DP. 1986

2-902-57421-5 

05;  M  5743 

Acquiring industrial mastery of expert systems: implementing
expert systems in a company. A successful experience of technology
transfer between research and an operational application; role and
action of the various participants from the research and business
worlds during the development of an expert system for fault
diagnosis. Natural language processing; the future of
computer-assisted translation (TAO), dialogue interfaces, writing
aids, dictation machines and text understanding.




INFO/VT
09 02

Computer science*; Expert system*; Artificial intelligence*;
Machine translation; Diagnosis; Fault; Automatic data processing;
Text; Speech recognition; Writing; Development study
Text processing


C-87-013628

Intelligence artificielle : outils, méthodes et applications.
Convention informatique. L'informatique : du discours à la méthode.
Paris (FR)

1986/09/15-1986/09/19

Direction des Industries Électroniques et de l'Informatique (FR)
Mémoire Congrès
MOL
FR

DIELI, Paris

VOL A; pp. 124-195; nombr. Ref.; nombr. Fig.; anglais, français;
DP. 1986
2-902-57421-5
05; M 5743

Knowledge representation (languages and methods): Intelligence
Service architecture, introduction to OPS5, concept learning,
GURU: a tool for developing expert systems in the management
world. Fifth-generation architecture: architecture of LISP
machines, parallel programming methods for a MIMD architecture,
DDC: Delta Driven Computer, an architecture for artificial
intelligence, from expert systems to knowledge-based systems,
artificial intelligence versus conventional computing techniques,
application of expert systems to database design, the contribution
of AI techniques in an office automation environment. Applied
aspects: pattern recognition, natural language in AI: the
computer-assisted translation (TAO) project, natural language and
document retrieval, automatic generation of natural-language
texts, speech recognition.

INFO/VT
09 02; 06 04

Computer science*; Artificial intelligence*; Database; Expert
system; Computer architecture; Office automation; Pattern
recognition; Machine translation; Speech recognition; Document
retrieval
System architecture; LISP programming language; Knowledge base;
Knowledge representation; Fifth-generation computer


C-87-011588 
Langage naturel.

Natural language.

9th International Joint Conference on Artificial Intelligence.

Los Angeles (US)

1985/08/18-1985/08/23
IJCAI/AAAI/UCLA (US)
Mémoire Congrès
ENG
US

Western Periodicals Co., North Hollywood

VOL 2/2; pp. 749-885; nbr. Ref.; nbr. Fig.; DP. 1985

0-934-61302-8 

05:  M.5234 

Nearly thirty papers devoted to natural languages deal with:
syntactic and phrase-structure analysis, grammars, interactions
between syntax and semantics, natural language generation and
machine translation, language processors, analysis of the Chinese
language, discourse structures, text summarisation, human-machine
conversation, access to intelligent databases, an intelligent
information system, on-line inference processing, analysis of
conjunctions using the Prolog language, and automatic speech
recognition.

INFO/CR
05 07; 09 02

Computational linguistics*; Programming language*; Human-machine
relations; Semantics; Grammar; Artificial intelligence;
Context-free language; Machine translation; Chinese language;
Information system; Dynamic programming

Natural language*; Language theory; Syntactic analysis; Automatic
speech recognition; Turing machine; Finite automaton; Text processing


C-87-004473 

Méthode d'analyse syntaxique du langage naturel par une procédure
de filtrage.

A parsing method of natural language by filtering procedure.
SAKAKI H.; HASHIMOTO K.; SUZUKI M.; NOGAITO I.; TAMARA T.

Kokusai Denshin Denwa Co., Tokyo, JP; Kokusai Denshin Denwa Co.,
Tokyo, JP; Kokusai Denshin Denwa Co., Tokyo, JP; Advanced
Telecom. Res. Inst. Int., Osaka-shi, JP; Software Consult. Co., Tokyo,
JP

Publication en série

ENG 

JP 

Transactions of the Institute of Electronics and Communication
Engineers in Japan (JP)

VOL E 69; NO 10; pp. 1114-1124; 6 Ref.; 8 Fig.; DP. 1985/10

TIEEOU 

0337-236X 

05;  P  1725 

Presentation of a parsing method, based on an extension of
'LINGOL', comprising two steps: an unrestricted tree decomposition,
followed by elimination of the incorrect branches by means of an
appropriate filter using the notions of 'forbidden tree' and
'preferred tree'. Description of the handling of logical 'OR'
nodes, which allow several trees to be expressed on a single tree.
Study of the practical application of this parsing method in the
'KATE' English-to-Japanese machine translation system.

INFO/CR 
05  07 

Machine translation*

Syntactic analysis*; Natural language*; Tree procedure;
Syntax-directed translation; Computer-aided analysis; Rejection
filter


BM-87-000314

Applications de microprocesseurs à l'intelligence artificielle.
Microprocessor applications in artificial intelligence.

12 EUROMICRO symposium on microarchitectures, developments and
applications.

Venice, IT

1986/09/15-1986/09/18

EUROMICRO Assoc. for microprocessing and microprogramming (NL)
Mémoire Congrès
ENG
NL

Elsevier Science Publishers B.V. (Amsterdam)

pp. 69-95; NB Ref.; NB Fig.; 3 mémoires; DP. 1986

0-444-70096-X 

05:  M  5872 

Proposal for an integrated preprocessor for executing PROLOG
programs, based on a Warren-type instruction set. Study of a new
computation structure for translating the LISP language,
applicable to microcomputers. Presentation of a software
architecture for a position estimator for a mobile robot in a
bounded office or domestic environment.

INFO/CR 
06  04;  09  02 

Artificial intelligence*; Microprocessor*; Instruction sequence;
Computer architecture; Microinstruction; Machine translation;
Compiler; Robot

Preprocessor; PROLOG programming language; LISP programming
language


BM-87-000105 
Reconnaissance de forme.

Pattern recognition.

Applications of artificial intelligence III.

Orlando (US)

1986/04/01-1986/04/03

The International Society for Optical Engineering
Mémoire Congrès
ENG
ZZ

Proceedings of SPIE (US)

SPIE, Bellingham

VOL 635; pp. 439-496; nombr. Ref.; nombr. Fig.; nombr. Tabl.; 6
communications; DP. 1986

SPIECJ 

0-892-52670-X 
05;  Me  10828 

Presentation of a method for automatically recognising changes of
primitives in the numerical control signals of machines.
Description of a reliable computer-based machine translation
system. Application of an expert system to diagnosis in
traditional Chinese medicine. Recognition of characters belonging
to several fonts using learning techniques.
INFO/HO 
06  04 

Pattern recognition*; Character recognition; Artificial
intelligence; Control system; Numerical control; Machine translation
Clinical diagnosis; Expert system


C-86-012650 

L'intelligence artificielle.

Artificial intelligence.

SATO S.; SUGIMOTO M.

Publication en série
ENG
ZZ

Fujitsu Scientific and Technical Journal (JP)

VOL 22; NO 3; pp. 139-181; 66 Ref.; 36 Fig.; 4 Tabl.; DP. 1986

FUSTA4

0016-2523 

05;  P1708 

State of the art of research and development in artificial
intelligence at Fujitsu, which is currently taking part in a
fifth-generation computer project.

INFO/TT 
09  02;  05  04 

Expert system*; Artificial intelligence*; Machine translation*;
Information processing; Speech recognition; Programming language;
Computer virtual memory; Computer-aided design


C-86-010787 

Dixième congrès international sur la linguistique automatisée.
10th international conference on computational linguistics. 22nd
annual meeting of the Association for computational linguistics.
Stanford, US
1984/07/02-1984/07/06

Association for Computational Linguistics (US)

Congrès

ENG

US

ACL (US)


551 p.; nb. Ref.; nb. Fig.; DP. 1984
052  N  3242 

Machine translation. Grammatical analysis. Semantic analysis.
Natural language interfaces. Syntactic analysis. Parsing.
Lexicography. Automatic text understanding.

INFO/GO
05 07

Machine translation*; Computational linguistics*

Syntactic grammar; Syntactic analysis; Natural language;
Lexicography; Discourse; Knowledge base


C-86-010286

Un langage simple d'application : le mini M-L.
A simple applicative language: mini M-L.

CLEMENT D.; DESPEYROUX J.; DESPEYROUX T.; KAHN G.

SEMA (FR); INRIA Sophia Antipolis (FR); INRIA Sophia Antipolis
(FR); INRIA Sophia Antipolis (FR)

INRIA, Le Chesnay

Rapport 

ENG 

FR 

529 

Rapport de recherche; 15 p.; 15 Ref.; 12 Fig.; DP. 1986/05

0249-6399 

05:  M  5208-4 

Formal description of the essential part of the ML language in
Natural Semantics. The static and dynamic semantics are treated,
as well as the translation to an abstract machine. This
description has been checked by computer, and we explain why such
checks are possible. A number of properties of the language are
easily expressed in the context of this method, and we prove them.
INFO/LP
09 02

Programming language*

Lambda calculus*; Implicit formula; Numerical coding


C-86-009U6 

Numéro spécial consacré au traitement du langage naturel.

Special issue on natural language processing.

Publication en série
ENG

ZZ

Proc. of the IEEE (US)

VOL 74; NO 7; pp. 899-1039; nb. Ref.; nb. Fig.; nb. Tabl.; DP.
1986/07

IEEPAO 

0018-9219 

05:  P  0739 

Knowledge representation and natural language processing. Natural
language and artificial experts. Dialogue-based user models.
Language generation. The translating machine: European, American
and Japanese perspectives. Evaluation of natural language
processing systems.
INFO/TT
05 07

Language*; Artificial intelligence*; Machine translation*; Expert
system*; Translation; Context-free language; Graph theory;
Computer program; Man-machine system

Syntax-directed translation*; Natural language*; Knowledge base*


C-8 5-004403 

Atlas : système de traduction automatique.

Atlas: automatic translation system.

UCHIDA H.; HAYASHI T.; KUSHIMA H.

Fujitsu (JP); Fujitsu (JP); Fujitsu (JP)

Publication en série

ENG 

JP 

Fujitsu  (JP) 

VOL 21; NO 3; pp. 317-329; 13 Fig.; DP. 1985/ET

FUSTA4 

05:  P.1708 

Two translation machines at Fujitsu: Atlas I, based on syntax,
and Atlas II, based on semantics. The two translation mechanisms
are explained.
INFO/GO
05 07

Machine translation*; Semantics; Syntax
Syntax-directed translation


C-86-003758 

Préparation d'une base de données de langue anglaise en ligne pour
l'information scientifique et technique japonaise.

Preparation of an online English language database for Japanese
scientific and technical information.

9th international online information meeting.

Londres, GB

1985/12/03-1985/12/05
MORITA A.; SATO M.; NISHIDA R.

The Japan Inf. Cent. of Sci. and Technol., JICST, JP; The Japan
Inf. Cent. of Sci. and Technol., JICST, JP; The Japan Inf. Cent. of
Sci. and Technol., JICST, JP

Mémoire Congrès

ENG

JP

Learned Information, Oxford and New Jersey (GB)
pp. 61-67; 2 Fig.; 1 Tabl.; DP. 1985
0-904-93350-4
05: M 5789




English-language database for use by foreigners, created in 1985
by the Japan Information Centre of Science and Technology.
Characteristics of the JICST file, Japanese-to-English conversion
systems, future of the file. The online service for the
English-language database will be operational in 1986 on the JOIS
host.

INFO/AN 
05  01;  05  02 

Database*; Technical information*; Japan; Scientific information;
English language; Machine translation

Interactive conversational system; Scientific cooperation


C-S6-P00678 

5ème congrès. Reconnaissance des formes et intelligence
artificielle. 2 tomes.

5ème Congrès. Reconnaissance des Formes et Intelligence
Artificielle. 2 Tomes.

Grenoble (FR)

1985/11/27-1985/11/29

AFCET (FR); Agence de l'Informatique (FR); INRIA (FR)

Congrès

FRE

FR

AFCET, Paris

VOL 1-2; 1283 p.; nbr. Ref.; nbr. Fig.; nbr. Tabl.; DP. 1985

2-903-67711-5 

05:  M.5836 

History of artificial intelligence. Computer-assisted translation.
Cognitive systems (Lore, Prolog, the KacMn 2 system). Image
segmentation, robotics, spoken human-machine dialogue, natural
language (in particular, an expert system for processing written
texts). Definition of a database of remote sensing images. A
system for building a hierarchy of experts in remote sensing.
Knowledge representation for an intelligent help system. Mobile
and stereo vision.

INFO/TT 
09  02:  06  04 

Expert system*; Artificial intelligence*; Pattern recognition*;
Robot; Remote sensing; Human-machine relations; Stereoscopic vision
Robotics*; Text processing*; Natural language*; Prolog programming
language


BM-86-000424 

L'intelligence artificielle appliquée à la traduction.

AI fine-tunes speech recognition.

GALLAGHER R.

Publication en série

ENG 

ZZ 

Electronics  (US) 

VOL 59; NO 20; pp. 24, 25; DP. 1986/05/19

ELECAD

0883-4989

05: P 0213

The Italian firm Olivetti expects major progress in translation
systems through the use of artificial intelligence. Application to
military intelligence.

INFO/VZ 
06  04;  05  04 

Artificial intelligence*; Military intelligence*; Machine
translation*; Italy; Hardware design study

Olivetti company*


TIB/A89-82002/XA0 

Computer-aided Saarbruecken Translation Service STS. Final report
of the MARIS project.

Computergestuetzte Saarbruecker Translationsservice
STS. Abschlussbericht des Projekts MARIS.

ZIMMERMANN H. H.; LUCKHARDT H. D.

Universitaet des Saarlandes, Saarbruecken (Germany,
F.R.). Fachrichtung Informationswissenschaft.

Bundesministerium fuer Forschung und Technologie, Bonn (Germany,
F.R.).

0199850)6 

Report 

GER 

DE

In German. Veroeffentlichungen der Fachrichtung
Informationswissenschaft, with 50 refs; NP. 265; DP. May 89.

U900I

NTIS Prices: PC E07
BMFT 1013209/2

The MARIS project (multilingual application of reference-oriented
information systems) has investigated the scientific and technical
preconditions for the application of computer-aided and machine
translation in the field of specialized information. MARIS has
established a computer-aided translation service for the
translation of specialized information from German data bases into
English. The report reflects the essential aspects of the project
work: integration of machine translation into a translation
service, man-machine interaction at a translator's workbench,
development, storage, and use of terminology in computer-aided and
machine translation, multilinguality of specialized information,
remaining problems, technical aspects, concrete translations.
(orig.). (TIB: FR 2736.) (Copyright (c) 1989 by FIZ. Citation
no. 89:082002.).

92 04; 88 02

Computer programs*; Machine translation*; Information systems;
Linguistics; Language programming; Dictionaries; Indexes
Documentation; Man machine systems; Personal computers
Foreign technology*; NTISTFFIZ; NTISFNGE; NTISLNGER




PB89-868913/XA0

Chinese and Japanese Language Translation by Computer. January
1975-August 1989 (Citations from the INSPEC: Information Services
for the Physics and Engineering Communities Database).

National  Technical  Information  Service.  Springfield.  VA. 

055665000 

Report 

ENG

US

Rept. for Jan 75-Aug 89; Supersedes PB87-863098; NP. 59; DP. Aug
89.

U8920

NTIS Prices: PC N01/MF N01

This bibliography contains citations concerning research and
development of computer hardware and software for the language
translation of Chinese and Japanese. Computer technology in
character recognition, sentence analysis, text input and output
systems, automatic language translation systems for personal
computers, and character generation and analysis are
discussed. Translation techniques for Chinese-to-Japanese,
Chinese-to-English, and Japanese-to-English are
presented. Applications in business, utilities management, and
library automation are included. (This updated bibliography
contains 100 citations, 40 of which are new entries to the
previous edition.).

92  04;  68  05 

Bibliographies*; Machine translation*; Chinese languages*; Japanese
languages*; Automatic language processing*; Input output devices
Computers; Computer systems hardware; Computer systems programs;
Character recognition

Chinese language translation*; Japanese language translation*;
Published Searches; NTISNTISN; NTISNERACO


PB89-867931/XA0

Machine Translation: Foreign Language Translation and
Natural Language Understanding. January 1970-July 1989 (Citations
from the NTIS Database).

National Technical Information Service, Springfield, VA.

055665000

Report 

ENG 

US 

Rept. for Jan 70-Jul 89; Supersedes PB87-868349; NP. 68; DP. Aug
89.

U8920

NTIS Prices: PC N01/MF N01

This bibliography contains citations concerning research and
development of machine/mechanical foreign language translation by
computer. Topics include syntactic and semantic translations,
natural language representation and understanding, knowledge based
systems, language manuals for ideographic machines, Systran
machine translation, mathematical linguistics and logic,
foreign technologies and language translation, processes for
question answering, and Chinese lexicography and
romanization. Methods and systems for translations of Russian,
German, Chinese, and Japanese to English are presented. (This
updated bibliography contains 126 citations, 30 of which are new
entries to the previous edition.).

92  04;  88  05 

Bibliographies*; Machine translation*; Automatic language
processing*; Computational linguistics*; Syntax; Semantics; Artificial
intelligence; Translating; English language; Russian language; German
language; Chinese language; Japanese language

Foreign languages*; Natural language*; Published Searches; Vocabulary;
NTISNTISN; NTISNERACO


N89-23363/9/XAO 

Objectives and Role of the Greek National Documentation Center.
BOUBOUKAS V.; SKOURLAS C.; POULAKAKI E.

National Hellenic Research Foundation, Athens (Greece).

National Aeronautics and Space Administration, Washington, DC.

080563000;  Nl 508359 

Report 

ENG 

GR 

In AGARD, The Organisation and Functions of Documentation and
Information Centres in Defence and Aerospace Environments 4 p; NP.
4; DP. Mar 89.

S2716

NTIS Prices: (Order as N89-23382/1, PC A06/MF A01)

A brief overview of the Greek information scene is presented. The
objectives and the role of the National Documentation Centre are
outlined together with some of its activities which proved to
function within such an information environment as well as plans
for continuity.

83 -CO*  68  02 

Computer programs*; Data management*; Information dissemination*;
Information systems*; Machine translation*; Greece*; Languages;
Alphabets; Bibliographies; Literature; Reports
Foreign technology*; NTISNASAE; NTISFNGR


N89-2066I/5/XA0 

Barriers  to  the  International  Transfer  of  Information  in  Aerospace 
and Defense.

Contained in AGARD-CP-430. Accessioned as N88-30458. Presented at
the Meeting on Barriers to Information Transfer and Approaches
Toward Their Reduction, Washington, DC, 23-24 Sep. 1987; Sponsored
by AGARD.

HARFORD J. J.; LAWRENCE B.

American Inst. of Architects Foundation, Washington, DC.

National Aeronautics and Space Administration, Washington, DC.
070160000;  AR54160S 

Conference 


ENG 

US 

NP. 5; DP. 1988.

S2713

NTIS Prices: PC A02/MF A01

An overview of the barriers to the international transfer of
information, particularly in the aerospace and defense area, is
given. The role of the professional society, motives, and types
of barriers are also discussed.

70  OS:  84  00:  74  00;  88  00 

Communication*; Information dissemination*; Information transfer*;
International cooperation*; Problem solving*; Aerospace industry;
Economics; Information retrieval; Machine translation; Organizations;
Politics; Standards
Barriers*; NTISuO


TIB/B89-80975/XAD 

GPSG  and  German  word  order. 

HAUENSCHILD C.

Technische Univ. Berlin (Germany, F.R.). Projektgruppe Kuenstliche
Intelligenz und Textverstehen.

Bundesministerium fuer Forschung und Technologie, Bonn (Germany,
F.R.).

030172001 

Report 

ENG 

GE 

KIT-52 

NP.  27;  DP.  Jun  87. 

U6915 

NTIS Prices: PC E07
BMFT  10  13207-1 

In this paper, the main concern is raising questions rather than
giving answers. The starting point is Hans Uszkoreit's revised
version of the LP (linear precedence) component within the
formalism. The author discusses some problems of Uszkoreit's
approach that result from the fact that the whole complex
phenomenon of German word order is described at a unique level of
linguistic representation. He then proposes a somewhat speculative
solution to some of these problems, which is based on a
multi-level approach to analysis and generation within the context
of machine translation (which is the setting of the
project KIT/NASEV and its successor KIT/FAST). (orig.). (Copyright
(c) 1989 by FIZ. Citation no. 89:080975.).

92  04 

Phrase structured grammars*; German word order*; Machine
translation*; Linear precedence component
Foreign technology*; NTISTFFIZ; NTISFNGE


N89-19919/4/XAD 

From ALGOL 60 to Ada: Problems, Solutions, Feasibility.

HUIJSMAN R. D.; VAN KATWIJK J.; PRONK C.; TOETENEL W. J.

Technische Hogeschool Delft (Netherlands). Dept. of Mathematics and
Informatics Computer Science.

National Aeronautics and Space Administration, Washington, DC.

016196068;  TJ479965 

Report 

ENG 

NL

REPT-08-41

NP. 47; DP. 1988.

S2712

NTIS Prices: PC A03/MF A01

Mechanical conversion of Algol 60 programs into Ada programs was
studied. Major problem areas include handling goto's, handling
procedures and parameters, and interaction with the environment. A
large number of Algol 60 constructs turn out to be hard or even
impossible to map. Tests, mainly in scientific computation, suggest
that, depending on the amount of effort to be put into the
project, between 80 pct and 90 pct of the source code can be
translated mechanically. Translation based only on lexical and
syntactical information is possible for about 50 pct of the source
text. Taking semantics into account augments the percentage to 80
pct to 90 pct. The remaining 10 pct to 20 pct of the code can be
translated partly if certain forms of idiom are taken into
account, the rest of the code being not mechanically
translatable. The inability to translate 20 pct of the source is
even worse than it seems. A manual translation of the remaining
code often requires a complete restructuring of the program,
including those parts that could be translated mechanically. Since
translation is manual to a considerable extent, maintenance is
also problematic and only possible when applied to the resulting
Ada programs. Unfortunately, the readability and recognizability of
the latter are seriously impaired by the consequences of partial
manual translation.
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In this report, investigation results of knowledge structure in a
nuclear data evaluation code are described. This investigation is
related to the natural language processing and the Knowledge Base
in the research theme of Human Acts Simulation Program (HASP)
begun at the Computing Center of JAERI in 1987. By using a
machine translation system, an attempt has been made to
extract a deep knowledge from Japanese sentences which are
equivalent to a FORTRAN program CASTHY for nuclear data
evaluation. With the knowledge extraction method used by the
authors, the verification of knowledge is more difficult than that
of the prototyping method in an ordinary AI technique. In the early
stage of building up a knowledge base system, it seems effective
to extract and examine knowledge fragments of limited objects. (ERA
citation 14:014093).
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As part of a project to develop a Japanese-English machine
translation system for technical texts within a limited
domain, we conducted a study to investigate the roles that
sublanguage techniques and operator-argument grammar would play in
the analysis and transfer stages of the system. The data consisted
of fifty sentences from the Japanese and English versions of the
FOCUS Query Language Primer, which were decomposed into elementary
sentence patterns. A total of 187 pattern instances were found for
Japanese and 191 for English. When the elements of these elementary
sentences were classified and compared with their counterparts in
the other language, we identified 43 word classes in Japanese and
43 corresponding English word classes. These word classes formed 32
sublanguage patterns in each language, 29 of which corresponded to
patterns in the other language. This paper examines in detail these
correspondences as well as the mismatches between sublanguage
patterns in Japanese and English. The high level of agreement found
between sublanguage categories and patterns in Japanese and
English suggests that these categories and patterns can facilitate
analysis and transfer. Use of operator-argument grammar, which
incorporates operator trees as an intermediate representation,
substantially reduces the amount of structural transfer needed in
the system. (EOC).
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In section 2 of this paper we briefly characterize our notion of
understanding a text. In section 3 we give an overview of the
system we have constructed for analyzing equipment failure
messages, and indicate the points at which it makes use of domain
information. We then turn in section 4 to the domain model itself,
and describe how it provides the information needed by language
analysis. We close with brief sections relating our work to other
work on discourse analysis and discussing how our system's
coverage may be broadened. (FR).
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This report describes research done by the PROTEUS Project at New
York University during the period January 15, 1985 to September
15, 1987. All of the activities described below were supported in
part by the Strategic Computing Program of the Defense Advanced
Research Projects Agency under Contract N00014-K-0163 from the
Office of Naval Research. The PROTEUS Syntactic Analyzer is
intended to provide an efficient, easy-to-use base for the various
experiments in computational linguistics. The basic, long-term
objective, as part of the Strategic Computing Program in Natural
Language Processing, is to develop the technology necessary for
the robust automated processing of messages containing natural
language narrative. One aspect of the development of such language
processing systems is the incorporation of detailed domain
knowledge and the effective use of such knowledge in language
analysis. The research has focused on one type of message, CASREPs
(equipment casualty reports), on developing detailed domain
knowledge (a model of the equipment), and on using this knowledge
for language understanding. Keywords: Text processing, Parallel
parsing, Semantics. (kr).
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The Integrated Machine Translation System PIVOT is a
machine translation system using a knowledge base that
accumulates knowledge of what is to be translated. The use of the
epochal PIVOT approach to machine translation (an intermediate
expression by conceptual structure) permits high-quality
translation and realizes integrated machine translation
from Japanese to English and vice versa. Expansion to machine
translation of multiple languages will be easy. The paper
describes the characteristics and functionality of the Integrated
Machine Translation System PIVOT, which was developed
toward such an object.
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PB89-150478 and Volume 41, Number 11, PB89-150445. Portions of this
document are not fully legible. Color illustrations reproduced in
black and white. NP, 196; DP, 1988.
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Special issue on application software. Office application
system; Software development support/operation management
systems; AI application systems; Scientific and engineering
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The primary obstacle to access to the Japanese technical
literature is the Japanese language. Manual translation of Japanese
technical material tends to be very expensive and, especially in
specialized technical fields, is often inaccurate. Machine-aided
translation (MT) offers the hope of eventually gaining a much
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broader access to Japanese scientific and technical literature. The
report to the U.S. Congress assesses the present state of
Japanese-to-English MT. Consideration is given to the MT process
itself, and to current activities in the U.S., Japan and
Europe. Attention is also given to the status of optical Japanese
character recognition devices as an input method for MT systems.
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Many syntactic parsing strategies for machine translation
systems are based entirely on context-free grammars. These parsers
require an overwhelming number of rules; thus, translation systems
using rule-based parsers either have limited linguistic coverage,
or they have poor performance due to formidable grammar size. This
report shows how a principle-based parser with a co-routine design
improves parsing for translation. The parser consists of a skeletal
structure-building mechanism that operates in conjunction with a
linguistically based constraint module, passing control back and
forth until a set of underspecified skeletal phrase-structures is
converted into a fully instantiated parse tree. The modularity of
the parsing design accommodates linguistic generalization, reduces
the grammar size, allows extension to other languages, and is
compatible with studies of human language processing. Keywords:
Natural language processing, Interlingual translation, Parsing,
Subroutines, Principles vs. Rules, Co-routine design, Linguistic
constraints. (edc).
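
The back-and-forth control flow described in this abstract can be
illustrated with a minimal sketch. It is not the report's parser: the
lexicon, the fixed Det-N-V skeleton, and the names constraint_module
and build_parse are all assumptions made for illustration. A Python
generator plays the role of the constraint module, and the structure
builder hands it one underspecified slot at a time.

```python
from typing import Dict, Generator, List, Optional

# Hypothetical toy lexicon (word -> category); not taken from the report.
LEXICON: Dict[str, str] = {"the": "Det", "dog": "N", "barks": "V"}


def constraint_module() -> Generator[Optional[dict], dict, None]:
    """Coroutine: receives an underspecified slot and sends back an
    instantiated leaf, or None if the toy constraint rejects it."""
    result: Optional[dict] = None
    while True:
        node = yield result  # control returns to the structure builder here
        category = LEXICON.get(node["word"])
        # Toy constraint: the word's category must match the slot's category.
        if category == node["expects"]:
            result = {**node, "category": category}
        else:
            result = None


def build_parse(words: List[str]) -> Optional[dict]:
    """Structure builder: proposes skeletal slots and passes each one to the
    constraint coroutine before committing it to the tree."""
    skeleton = ["Det", "N", "V"]  # assumed skeletal phrase structure
    if len(words) != len(skeleton):
        return None
    checker = constraint_module()
    next(checker)  # prime the coroutine so it waits at its first yield
    leaves = []
    for word, expected in zip(words, skeleton):
        leaf = checker.send({"word": word, "expects": expected})
        if leaf is None:
            checker.close()
            return None
        leaves.append(leaf)
    checker.close()
    return {"S": {"NP": leaves[:2], "VP": leaves[2:]}}
```

Calling build_parse(["the", "dog", "barks"]) returns a fully
instantiated tree, while an input that violates the constraint, such
as ["dog", "the", "barks"], returns None.
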
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The MELTRAN system achieves high-quality translation through
interpretation rules that identify special usages as well as
general linguistic constructions. New rules can be added to
customize the system for specific applications. The system
translates 10,000 words/h. (Copyright (c) 1988, Mitsubishi Electric
Corporation.).
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Current approaches to generation for machine translation
make use of direct-replacement templates, large grammars, and
knowledge-based inferencing techniques. Not only are rules
language-specific, but they are too simplistic to handle sentences
that exhibit more complex phenomena. Furthermore, these systems are
not easily extendable to other languages because the rules that
map the internal representation to the surface form are entirely
dependent on both the domain of the system and the language being
generated. Finally, an adequate interlingual representation has not
yet been discovered; thus, knowledge-based inferencing is necessary
and syntactic cross-linguistic generalization cannot be
exploited. This report introduces a plan for the development of a
theoretically based computational scheme of natural language
generation for a translation system. The emphasis of the project is
the mapping from the lexical conceptual structure of sentences to
an underlying or base syntactic structure called deep
structure. This approach tackles the problems of thematic and
structural divergence, i.e., it allows generation of target
language sentences that are not thematically or structurally
equivalent to their conceptually equivalent source language
counterparts. Two other more secondary tasks, construction of a
dictionary and mapping from deep structure to surface structure,
will also be discussed. The generator operates on a constrained
grammatical theory rather than on a set of surface level
transformations. (kr).
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The document presents Report Memoranda issued by the Tokyo Office
of the U.S. National Science Foundation (NSF) during the first half
of 1986. The Memoranda included in the volume are: 1985 Survey of
Research and Development in Japan; Japan Key Technology
Center; Directory of Japanese Company Laboratories Willing to
Receive American Researchers; Japanese S&T Budget for Japanese
Fiscal Year 1986; Japanese Machine Translation Efforts: A
Look at Three Selected MT Systems; A Visit with Dr. Jiro Kondo,
President, Science Council of Japan; Proposed New Law to Encourage
Industry/Government Cooperation in Science and Technology; STA and
RIKEN to Launch 'International Frontier Research System'; Japan's
Key Technology Center Selects Twenty-five R&D Projects for Capital
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Developed in response to the Japanese Technical Literature Act of
1986, the Directory has been divided into four parts. The first
part contains an alphabetical list of commercial services that
collect, abstract, translate or disseminate Japanese technical
information. Following this are two indices, one by area of
specialization and one by state. The second part lists Government
agencies with programs and services involving Japanese technical
information. The third contains libraries in both the public and
private sectors that have extensive holdings of Japanese technical
information. The final part cites Japanese technical documents
translated at Federal expense which are available to the public. In
addition to those directories, the publication also includes
background articles: (1) universities that have initiated programs
to provide undergraduate and graduate students, as well as
experienced scientists and engineers, with sufficient proficiency
in Japanese to enable them to take advantage of the large amount
of untranslated material emanating from Japan; (2) the status of
Japanese-to-English machine translation projects in the
United States, Europe and Japan; (3) U.S. Government efforts to
implement the Japanese Technical Literature Act; (4) follow-up on
two case studies reported in the 1987 Directory; (5) a private
sector view of America's readiness to take advantage of Japanese
technology.
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This report presents an approach to natural language translation
that relies on principle-based descriptions of grammar rather than
rule-oriented descriptions. The model that has been constructed is
based on abstract principles as developed by Chomsky (1981) and
several other researchers working within the 'Government and
Binding' (GB) framework. The approach taken is 'interlingual',
i.e., the model is based on universal principles that hold across
all languages; the distinctions among languages are then handled by
settings of parameters associated with the universal
principles. The design of the UNITRAN (UNIversal TRANslator) system
is such that a language may be described by the same set of
parameters that specify the language in linguistic theory. Because
of the modular nature of the model, the interaction effects of
universal principles are easily handled by the system; thus, the
programmer does not need to specifically spell out the details of
rule applications. Because only a small set of principles covers
all languages, the unmanageable grammar size of alternative
approaches is no longer a problem. Keywords: Natural language
processing, Interlingual machine translation, Co-routine
design, Principles and parameters, Parsing, Thematic substitution.
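
The idea of one universal principle plus per-language parameter
settings can be sketched as follows. This is only an illustration
under stated assumptions: the head-position parameter is a standard
textbook example from principles-and-parameters theory, not
necessarily one of the parameters used in UNITRAN, and the names
Parameters and project_phrase are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Parameters:
    """Per-language settings; the principle below is shared by all languages."""
    head_initial: bool  # True: the head precedes its complement


# Assumed settings for illustration: English verb phrases are head-initial,
# Japanese verb phrases are head-final.
ENGLISH = Parameters(head_initial=True)
JAPANESE = Parameters(head_initial=False)


def project_phrase(head: str, complement: str, params: Parameters) -> List[str]:
    """Universal principle: a phrase consists of a head and its complement.
    Only the linear order is left to the language's parameter setting."""
    if params.head_initial:
        return [head, complement]
    return [complement, head]
```

With these settings, project_phrase("read", "books", ENGLISH) orders
the verb before its object, and project_phrase("yomu", "hon-o",
JAPANESE) orders it after. No language-specific ordering rule is
written anywhere; adding a language means adding a parameter setting.
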
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Contents: Barriers to the International Transfer of Information in
Aerospace and Defense; Linguistic and Cultural Barriers to the
Transfer of Information; Political and Economic Barriers to
Information Transfer; Linguistic and Technical Aspects of
Machine Translation; Information Retrieval Systems Evolve -
Advances for Easier and More Successful Use; Information Technology
to Facilitate Group Interaction; Words: Key or Barriers to
Information Transfer; Linguistic Barriers: Translation
Problems; Technical Change Needs Organizational Change; and Using
Standards to Break Down Information Transfer Barriers.
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Contents: The information industry and databases in Japan
(Availability of information on Japan in the West, The development
of the Japanese information industry, The use of databases in
Japan, The development of the database industry in
Japan); Machine translation of Japanese - an introduction
(Research and development on machine translation in Japan,
Source texts, The Japan-Info project in Europe, Machine
translation systems in Japan, Assessment of machine
translation in connection with searches in Japanese-language
databases); Description of selected major databases in Japan
(Description of database vendors, Nikkei Telecom); Machine
translation systems (The basic linguistic problems, Levels of
translation, Machine translation in practice, The
combination on-line search/machine translation, Japanese
vs. Western usage, Human translation vs. machine
translation, Commercially available machine
translation systems, Experiences of MT system users: Honda R&D
Center, Experiences of MT system users: Daiwa Securities
Group); Examples of machine translation (The automatic
translation of database searches, Examples of translations
achieved with pre-editing, Comparative translations)
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Machine translation has been a particularly difficult
problem in the area of Natural Language Processing for over two
decades. Early approaches to translation failed, partly because
interaction effects of complex phenomena made translation appear
to be unmanageable. Later approaches to the problem have been more
successful but are based on many language-specific rules of a
context-free nature. To try to capture all of the phenomena allowed
in natural languages, context-free rule-based systems require an
overwhelming number of rules; thus, such translation systems either
have limited linguistic coverage, or they have poor performance
due to formidable grammar size. This report presents an
implementation of an alternative approach to natural language
translation. The UNITRAN (Universal Translator) system relies on
principle-based descriptions of grammar rather than rule-oriented
descriptions. The approach taken is interlingual, i.e., the model
is based on universal principles that hold across all
languages; the distinctions among languages are then handled by
settings of parameters associated with the universal
principles. The grammar is viewed as a modular system of principles
rather than a large set of ad hoc language-specific
rules. Interaction effects of linguistic principles are handled by
the system so that the programmer does not need to specifically
spell out the details of rule applications. Only a small set of
principles covers all languages; thus, the unmanageable grammar
size of alternative approaches is no longer a problem.
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Although parser generators have provided significant power for
language recognition tasks, many of them are deficient in error
recovery. Of the ones that do provide error recovery, many
produce unacceptably slow parsers. I have designed and implemented
a parser generator that produces fast, error-recovering
parsers. For any input, the error recovery technique guarantees
that a syntactically correct parse tree will be delivered after
parsing has completed. This improves robustness because the
remaining compilation phases, such as semantic analysis, will not
have to deal with infinitely many special cases of incorrect parse
trees. The high speed of the parser is a result of making the code
directly executable and paying careful attention to implementation
details. Measurements show that the generated parser runs faster
than any other parser examined, including handwritten recursive
descent parsers. The cost of this fast parser with error recovery
is a slight increase in space. Although this particular generator
requires LL grammars, the ideas can be applied to generators
taking LALR grammars. Furthermore, we give the transformations that
allow one to transform many LALR grammars into equivalent LL
grammars.
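
The guarantee that a well-formed parse tree is delivered for any
input can be illustrated with a small hand-written LL(1) parser. This
is a sketch under stated assumptions, not the report's generator or
its recovery technique: the grammar (expr -> term ('+' term)*,
term -> NUM | '(' expr ')'), the skip-or-insert recovery strategy,
and all names are chosen here for illustration.

```python
from typing import List, Tuple

END = "$"  # sentinel marking end of input (assumed not to occur in tokens)


class RecoveringParser:
    """Recursive-descent parser that repairs errors instead of failing."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0
        self.errors: List[str] = []

    def _peek(self) -> str:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else END

    def _skip(self) -> None:
        self.errors.append(f"skipped unexpected token {self._peek()!r}")
        self.pos += 1

    def parse(self):
        tree = self._expr()
        while self._peek() != END:  # discard trailing tokens that fit nowhere
            self._skip()
        return tree

    def _expr(self):
        node = self._term()
        while self._peek() == "+":
            self.pos += 1
            node = ("+", node, self._term())
        return node

    def _term(self):
        # Skip tokens that can neither start a term nor follow one.
        while self._peek() not in {"(", "+", ")", END} and not self._peek().isdigit():
            self._skip()
        tok = self._peek()
        if tok.isdigit():
            self.pos += 1
            return ("num", tok)
        if tok == "(":
            self.pos += 1
            node = self._expr()
            if self._peek() == ")":
                self.pos += 1
            else:
                self.errors.append("inserted missing ')'")
            return node
        # Operand is missing: insert a placeholder so the tree stays well formed.
        self.errors.append("inserted missing operand")
        return ("num", "0")


def parse_with_recovery(tokens: List[str]) -> Tuple[tuple, List[str]]:
    parser = RecoveringParser(tokens)
    tree = parser.parse()
    return tree, parser.errors
```

Later phases can then assume that every tree has the shape
("num", value) or ("+", left, right), and consult the error list
separately to report what was repaired.
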
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This report describes the development of computer software for
converting hex-octal, alphanumeric and pure-alpha mode input in
English into 'phonetic Devanagari characters', which can be
printed through dot-matrix printers in 2 passes of the print-head,
along with English text in the same lines. If the multilingual
terminals presently available in India are used, 4 passes of the
print-head are required for printing phonetic Devanagari characters,
and English text is also converted into phonetic Devanagari script
during printing. Thus, the software reported here is an
improvement over the facilities currently available in the Indian
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The present state of verb addition analysis is described in the
context of segmenting natural language sentences within the
translation system SUSY. Exact data are given on treating the verb
addition in SUSY (the operator carries out the sentence segmenting
so that the verb and the verb addition are combined), on verb
addition analysis in rules and tables and for expanding the
program structure (program structure plan, unchanged or new
sub-programs). The practice of verb addition analysis (operator
PHRASEG) is made clear from some alphanumeric examples (different
sentences and their segments, word order, nodes, main
sentences). (HWJ). (TIB: RO 3907 (5.3).) (Copyright (c) 1988 by
FIZ. Citation no. 88:080799.).
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Knowledge bases in the form of semantic networks and of (word)
expert systems are available for organising knowledge. However,
they are limited thematically. If one does not want to limit the
operating area of machine parsers, one must fulfil certain
requirements. The data must not be isolated from each other; the
dictionary and structural connection between the source and target
languages of the system must be produced; the syntactical and
semantic information must be sufficient for analysis and
translation; and the data organisation must be easy to expand and
to correct. There was a detailed examination of the extent to which
the SUSY dictionary system fulfils these requirements. Data on this
concern the organisation of statistical linguistic knowledge and
the linguistic knowledge of the SUSY dictionaries (analysis,
semantic, transfer and synthesis dictionaries). (HWJ). (TIB: RO 2852
(7).) (Copyright (c) 1988 by FIZ. Citation no. 88:080027.).
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The paper presents a preliminary overview of the conceptual
language Koto, developed for knowledge base applications. Its
underlying assumption is that knowledge representation is natural
language bound. Koto is a way to represent different levels of
information present in a sentence. Inference rules are presented
that allow for syllogistic reasoning. Koto has several
applications: (1) conceptual modelling; (2) knowledge
representation; and (3) machine translation.
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It is shown in the report that the database industry in Japan is
on the verge of becoming a more dynamic sector fueled by the
demand for more selective and timely information but also by new
developments including those of machine translation. The
establishment of a new promotion organization and initiatives by
MITI including new financial schemes by the Japan Development Bank
give further indication that the Japanese database industry will
undergo considerable growth. Against this background the Research
Policy Institute has carried out exploratory searches in a few of
the major databases in Japan covering industrial, technical,
economic and political fields. The report clearly demonstrates that
on-line searches in Japanese language databases can with relative
ease be carried out where good telephone lines are available. It
has also been demonstrated that such searches make it possible to
obtain critical information more selectively and more quickly than
traditional ways of obtaining the same information. It is also
demonstrated that Japanese databases contain pertinent information
which may ordinarily be difficult to obtain. Table of Contents: The
database industry in Japan; Searching in Japanese
databases; Comments on some databases for economics, and science
and technology; Actual searches in selected databases; Online
database searching procedure; Evaluation of searches; Machine
translation; Major Japanese databases - Selective listing; Cost
of on-line searching; List of manuals and thesauruses.
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approach to artificial intelligence and related product
overview; AI oriented computer; The programming environment of AI
languages; A tool for expert systems; An expert system of computer
operation and utilization; Integrated machine translation
system PIVOT; Fundamental research on artificial
intelligence; Researches on artificial intelligence
applications; Problems in artificial intelligence.
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This bibliography contains citations concerning machine assisted
language translation and computers that manipulate several
languages with dissimilar alphabets. Software packages that
translate Chinese, French, German, Italian, Japanese, Spanish, and
Arabic to English are discussed, along with software that
translates English to other languages. Word processors and
computers that manipulate Hebrew/English, Arabic/English,
Arabic/French, and English/Chinese characters are
included. (Contains 97 citations fully indexed and including a
title list.).
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This bibliography contains citations concerning research and
development of machine/mechanical foreign language translation by
computer. Topics include syntactic and semantic translations,
natural language representation and understanding, knowledge based
systems, language manuals for ideographic machines, Systran
machine translation, mathematical linguistics and logic,
foreign technologies and language translation, processes for
question answering, and Chinese lexicography and
romanization. Methods and systems for translations of Russian,
German, Chinese, and Japanese to English are presented. (This
updated bibliography contains 96 citations, 25 of which are new
entries to the previous edition.).
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The 1981 design specifications for the Egyptian National
Scientific and Technical Information Network (ENSTINET) stipulated
that major end-user facilities of the system should be bilingual
in English and Arabic. Many characteristics of the Arabic alphabet
and language impact computer applications, and there exists no
universally accepted character encoding scheme equivalent to the
ASCII standard for Latin alphabets. In order to overcome the native
language barrier in the system, a native language interface to
existing software was developed. The Arabic language software
functions include an Arabic editor running under the UNIX
operating system, an Arabic database search facility, and
electronic mail, which were implemented for peripheral devices
using the CODAR/UFD Arabic character encoding scheme. The Arabic
database search facility has been developed by arabizing BRS/Mate,
a menu-oriented front end to the native mode of Mini-Micro
BRS/Search, a full text, state-of-the-art information management
software system. Five references are provided. (MES).
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Chinese and Japanese Language Translation by Computer. January
1975-June 1987 (Citations from the INSPEC: Information Services
for the Physics and Engineering Communities Database).
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This bibliography contains citations concerning research and
development of computer hardware and software for the language
translation of Chinese and Japanese. Computer technology in
character recognition, sentence analysis, text input and output
systems, automatic language translation systems for personal
computers, and character generation and analysis are
discussed. Translation techniques for Chinese-to-Japanese,
Chinese-to-English, and Japanese-to-English are
presented. Applications in business, utilities management, and
library automation are included. (Contains 60 citations fully
indexed and including a title list.)
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a method for generating custom self timed integrated circuits
(ICs) from algorithmic descriptions of the desired circuits. The
goal is to quickly produce prototype integrated circuit masks that
implement various algorithms and data types in order to evaluate
the IC power, delay, and area characteristics. A topology and
behavior preserving mapping is used to perform the translation
from constructs in the function language to mask
primitives. Keywords: Algorithms; Integrated circuits masks; Self
timed integrated circuits; High level
language; Compilers; Translator; Algol 68; Templates.
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In many well-known machine analysis and translation systems (MUe
systems), the valencies of the verbs, adjectives and nouns are
used implicitly or explicitly. The degree of explicitness is shown
by the purpose for which valencies are used and what value they
take up in the theoretical description. This is a report on an MUe
system which has selected, implemented and tested the valency
theory as the central grammatical theory. The theoretical
implications of the practical use of valencies in these MUe
systems are dealt with in detail. An MUe related valency theory is
propounded, and the differences between obligatory and facultative
factors and free data are shown. The case theory (notation
variants, designed definition of case roles) is also included in
the syntactical-semantic representation. (HWJ). (TIB: RO 2852 (4).)
(Copyright (c) 1987 by FIZ. Citation no. 87:080056.).
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In the context of a project on 'electronic speech research', the
question was tackled of how natural speech systems developed on
small sections of speech behave, if they are confronted by large
amounts of text from wide areas of application. This problem does
not occur in the development phase of many systems, as one usually
works with restricted amounts of text. This is a report on solving
this problem by a system, which makes access to the machine speech
analysis and translation systems SUSY-II and SUSY-III easier, and
which is intended to control the modification of the linguistic
and strategic information contained in them. The background to the
system is sketched and reference is made to theoretical bases of
systems. A model developed from this is introduced (experimental
generation of parts of this model). Finally, the integration of
data bases in this concept is shown. (HWJ). (TIB: RO 2852 (5).)
(Copyright (c) 1987 by FIZ. Citation no. 87:080055.).
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The Lexical translator is a program written in Turbo PASCAL to
generate a Latin PASCAL source code from an Arabic PASCAL source
code. The Arabic code is written under a bilingual operating system
transparent to the DOS on personal computers. The bilingual
operating system compatibility as well as the Arabic characters'
code values is investigated. The Latin code is fed into a computer
to be compiled and run with a Latin interpreter (i.e., Turbo
PASCAL), in an Arabic environment. (Author).
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This paper examines the nature of generation systems today, the
problems they have been designed to deal with, their strengths and
their weaknesses. Its goals are to give the MT community a sense of
what has been accomplished, and indirectly to show where MT
researchers could consider adopting or adapting some of the AI
work. This work on generation need not be done by AI people alone.
MT can, for example, contribute to AI research on the planning
level by sharpening our collective understanding of the 'carrying
capacity' of the different parts of a language through
cross-language comparisons that try to fit the ideas carried by the
linguistic devices of a source language into the alternative
devices of a target language. At lower levels, MT as a task can
provide more linguistically demanding sources for generation than
most any of today's expert systems. At the same time it is clear
that generation is done for very different reasons in the two
camps. The AI context is more like that of people dealing with each
other in normal life, of which translation is not a customary
part. Nevertheless, translation is a normal human capacity, and a
considered comparison of the generation process in both contexts
should tell us more about the nature of generation as a module
within the human mind than could either by itself.
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This paper deals with the problems of transfer within the
framework of machine translation systems. After a brief
general discussion of the role of the transfer phase in
machine translation, the authors give an intuitive
analysis of a typical lexical transfer problem that arises in the
translation of a short German text into English. In the light of
the requirements derived from that example, they propose a system
of multi-level representations for source and target texts and a
corresponding multi-level transfer phase for the MT project
KIT/NASEV. The formalisms for the different levels of
representation are illustrated on the basis of the given sample
problem of lexical transfer. (Copyright (c) 1986 by FIZ. Citation
no. 86:80315.).

92  04;  62  00 

Translators*; Machine translations*; Lexical transfer*; Transferring;
Automatic language processing; English language; German language;
Linguistics

Foreign  technology* :NTISFIZ;NTISFNG£;NTISLNGER 
N86-297  25/6/XA  D 

Travail dans le Cadre d'un Reseau de Terminologie en Matiere de
Technologie de l'Espace

Research in the Framework of a Terminology Network in the Field of
Space Technology.

Text in French. Presented at Infoterm Symposium (2nd),

30UDJE0I0  ]i. 

Societe Nationale Industrielle Aerospatiale, Paris (France).
National Aeronautics and Space Administration, Washington, DC


066215000;  S0451674 

Conference 

f^E 

PR 

SNIAS-86I-550-I0U  ESA-86-97»78 
NP,  9:  OP,  1986. 

$2420 

NTIS  Prices:  PC  A02/MF  A01 

The network implemented by a leading aerospace company is
described. It is based on a terminological and linguistic
coordination extended to national and international levels, a
terminological data bank, and a dictionary publishing center. The
integration of these activities in an industrial organization is
explained. The role played by automatic translation systems is
discussed.
THALIA-3 is a new machine-translation system which can be
operated on MELCOM-COSMO Series computers. The main purpose of its
development is translation of Japanese into English and vice versa
at high speeds. It uses knowledge information technology plus
semantic representation to meet this requirement. It also has a
basic 60,000-term dictionary, the technical vocabulary of which can
be extended, and covers a wide variety of technical fields.
Text in Japanese with English abstracts. See also PB86-19859S
The issue contains technical reports on: a home-use
high-definition VCR; Software for the development of home
electrical appliances and home-automation systems; A verification
system for logic programs; THALIA-3, a Japanese-English
machine-translation system; Bandwidth compression of video
signals by means of vector quantization; Compound semiconductor
superlattice heterostructures; Recent advances in
superconducting-magnet technology; The development of a
three-dimensional CAD/CAM system; The SO a- SAGE high-power CO2
laser excitation system; Optical pickups for compact-disc
players; Three-dimensional device technology; A high-performance
photomask with a molybdenum silicide film; Ergonomics in industrial
design; Multibeam antennas; Magnetic heads and media for
high-density disk drives; A high-resolution, high-quality
thermal-printing head; A l«o dynamic MOS RAM.
The following topics were dealt with: information based
parsing; disjunctive constraint satisfaction; head-driven
bidirectional parsing; head-driven parsing; probabilistic
parsing; speech recognition; dependency grammar parsing; combinatory
grammars; Tomita algorithm; computational complexity; connectionist
language model; left-associative grammar; finite state
machines; morphological parser; chart
Focuses on the RIME system aimed at the indexing of medical
reports in a multimedia environment. This particular application is
viewed to be appropriate for a large set of needs within large
user communities: domain experts dealing with on-line specialized
documentation such as software engineers, medical specialists and
so on. In this application textual information appears as an
interesting means of accessing related pictures in the data
base. After presenting the application and a study of the
particular corpus involved, the authors define a semantic model
for the documents based on a conceptual language. They detail the
indexing process and its various linguistic components, essential
for the translation of medical reports
LMT (logic-based machine translation) is an experimental
English-to-German MT system, being developed in the framework of
logic programming. The English analysis uses a logic grammar
formalism, modular logic grammar, which allows logic grammars to
be more compact, and which has a modular treatment of syntax,
lexicon, and semantics. The English grammar is written
independently of the task of translation. LMT uses a
syntax-to-syntax transfer method for translation, although the
English syntactic analysis trees contain some results of semantic
choices and show deep grammatical relations. Semantic type checking
with Prolog inference is done during analysis and transfer. The
transfer algorithm uses logical variables and unification to good
advantage: transfer works in a simple left-to-right, top-down
way. After transfer, the German syntactic generation component
produces a surface structure tree by application of a system of
tree transformations. These transformations use an augmentation of
Prolog pattern matching. LMT has a single lexicon, containing both
source and transfer information, as well as some idiosyncratic
target morphological information. There is a compact external
format for this lexicon, with a lexical preprocessing system that
applies defaults and compiles it into an internal format
convenient for the syntactic components. During lexical
preprocessing, English morphological analysis can be coupled with
rules that synthesize new transfer entries
The article discusses the use of the interactive tool SPPS in the
computer-aided translation of STI texts. This
semi-automatic translation system is described in detail, its
fundamental properties being presented. English language is
required as the input and Czech language is produced as the
output. SPPS's output requirements are discussed from the general
linguistic point of view, along with the implementation software
and corresponding data structures. Debugging of the system is
described. Further developments of SPPS, along with its technical
demands and applications, are presented
Designers of automatic information processing systems are
increasingly concerned with providing a formal conceptual
representation of the problem universe. For systems operating with
natural language texts, this means extracting from the natural
language its conceptual content and endowing the content with a
form, making it susceptible to computer input and processing
according to a desired order. The author describes the results of a
study of the semantic structure of English news texts, based on
a model of distributional semantic classes (DSC) obtained with the
SIMPAR-SMIT software package for the AVESTA national databank. The
model and the analysis procedures are based on a novel application
of the concept of distributional semantic analysis
Reports on analysis of Japanese sentences conducted to determine
their qualitative and quantitative characteristics which are
considered to contribute to relative complexity of the Japanese
language. Twenty-six textbooks used for classes of Japanese
language, mathematics, science, and social study at elementary
schools (specifically for second and fifth graders), junior high
schools, and high schools were sampled to obtain some 2700
sentences for examination. The objects of quantitative analysis
included the length of sentences, the number of verbs, adjectives,
and adjectival verbs, and the numbers of modifying phrases and
parallel structures. The qualitative examination encompassed
homonyms, morphological ambiguities, sentence styles, parallel
structures, ellipses, and anaphora. Five sentences were chosen from
16 textbooks (one each grade, one each subject). These selected
sentences identified as those having the average characteristics
are expected to serve as data against which present and future
machine translation systems are evaluated
The author is interested in formalisms which are being used or
have applications in the domain of machine translation (MT). His
interest lies mainly in their role in the domain in terms of the
ease in expressing linguistic knowledge required for MT, as well
as the ease of implementation in MT systems. He begins by
discussing formalisms within the general context of MT, clearly
separating the role of linguistic formalisms on one end, which are
more apt for expressing linguistic knowledge, and on the other,
the SLLPs which are specifically designed for MT systems. He argues
for another type of formalism, the general formalism, to bridge
the gap between the two. Next he discusses the role of formalisms
in analysis and in generation, and then more specific to MT, in
synthesis. He sums up with the building of a compiler that
generates a synthesis program in SLLP from a set of specifications
written in a general formalism
The author demonstrates that the enriched theoretical vocabulary
of situation semantics offers a more intuitive characterisation of
the translation process than was possible using more traditional
semantic theories. This demonstration takes the form of a
formalisation of the most commonly used method for MT in terms of
situation semantic constructs. He considers what the theory of
situation semantics has to offer to an MT application. The paper
consists of a basic introduction to the machinery of situation
semantics, an examination of the problem of translation, a formal
description of a transfer-based MT system and some examples of the
kind of lexical transfer one would expect to define in such a
system
Theoretical research in the area of machine translation usually
involves the search for and creation of an appropriate
formalism. An important issue in this respect is the way in which
the compositionality of translation is to be defined. The authors
introduce the anaphoric component of the Mimo formalism. It makes
the definition and translation of anaphoric relations possible,
relations which are usually problematic for systems that adhere to
strict compositionality. In Mimo, the translation of anaphoric
relations is compositional. The anaphoric component is used to
define linguistic phenomena such as wh-movement, the passive and
the binding of reflexives and pronouns monolingually. The actual
working of the component is shown by means of a detailed
discussion of wh-movement
The authors describe a framework for research into translation
that draws on a combination of two existing and independently
constructed technologies: an analysis component developed for
German by the EUROTRA-D group of IAI and the generation component
developed for English by the Penman group at ISI. They present some
of the linguistic implications of the research and the promise it
bears for furthering understanding of the translation process
The authors sketch and illustrate an approach to machine
translation that exploits the potential of simultaneous
correspondences between separate levels of linguistic
representation, as formalized in the LFG notion of
codescriptions. The approach is illustrated with examples from
English, German and French where the source and the target
language sentence show noteworthy differences in linguistic
analysis
A descriptive framework for translating speaker's meaning - towards
a dialogue translation system between Japanese and English
A framework for translating speaker's meaning or intention is
proposed based on two notions: illocutionary force types (IFTs) for
analysis and decision parameters (DPs) for generation. IFTs are a
certain kind of classification of utterances concerning speaker's
meaning. DPs present background information of language use in
order to derive an appropriate expression from speaker's
meaning. In Japanese, IFTs can be derived automatically through
syntactical constraints. To generate appropriate expressions,
language-specific communication strategies related to DP values
should be given a priori. The whole process is performed in a
unification-based framework
Presents an algorithm for incremental chart parsing, outlines how
this could be embedded in an interactive parsing system, and
discusses why this might be useful. Incremental parsing means that
input is analysed in a piecemeal fashion, in particular allowing
arbitrary changes of previous input without exhaustive
reanalysis. Interactive parsing means that the analysis process is
prompted immediately at the onset of new input, and possibly that
the system then may interact with the user in order to resolve
problems that occur. The combination of these techniques could be
used as a parsing kernel for highly interactive and 'reactive'
natural-language processors, such as parsers for dialogue systems,
interactive computer-aided translation systems, and
language-sensitive text editors. An incremental chart parser
embodying the ideas put forward has been implemented, and an
embedding of this in an interactive parsing system is near
completion
C6180N;  C4210 

computational linguistics; grammars; interactive systems; natural
languages

incremental chart parsing; interactive parsing system; piecemeal
fashion; parsing kernel; natural language processors; dialogue
systems; computer aided translation systems; language sensitive text
editors


CS9057055 

The organization of the Rosetta grammars
The organization of the grammars in the Rosetta machine
translation system is described and it is shown how this
organization makes it possible to translate between words of
different syntactic categories in a systematic way. It is also
shown how the organization chosen makes it possible to translate
small clauses into full clauses and vice versa. The central concept
worked out here in some detail is the concept of partial isomorphy
between subgrammars. The system as described has been implemented
and is being tested
C6180N;  C4210;  C7820 
Ambiguity resolution in the DMTRANS PLUS

Fourth Conference of the European Chapter of the Association for
Computational Linguistics. Proceedings of the Conference
Manchester, UK
10-12 April 1989

KITANO H.; TOMABECHI H.; LEVIN L.

Carnegie Mellon Univ., Pittsburgh, PA, USA

Conference  paper 

Practical 

ENG 

US 

Assoc. Comput. Linguistics; Morristown, NJ, USA
NP. xxv+326; PP. 72-9; 28 Ref.; DP. 1989
The authors present a cost-based (or energy-based) model of
disambiguation. When a sentence is ambiguous, a parse with the
least cost is chosen from among multiple hypotheses. Each
hypothesis is assigned a cost which is added when: (1) a new
instance is created to satisfy reference success, (2) links
between instances are created or removed to satisfy constraints on
concept sequences, and (3) a concept node with insufficient
priming is used for further processing. This method of ambiguity
resolution is implemented in DMTRANS PLUS, which is a second
generation bi-directional English/Japanese machine translation
system based on a massively parallel spreading activation paradigm
C6180N;  C7820 
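
The least-cost selection described in the abstract above can be
sketched roughly as follows. This is only an illustration in Python:
the numeric cost values, the event names and the Hypothesis class are
assumptions made for the example, since the abstract names the three
cost-incurring events but gives neither weights nor data structures.

```python
from dataclasses import dataclass, field

# Hypothetical per-event costs (assumed values, not from the abstract).
EVENT_COSTS = {
    "new_instance": 1.0,   # (1) a new instance was created
    "link_change": 0.5,    # (2) a link between instances was created or removed
    "weak_priming": 2.0,   # (3) a concept node with insufficient priming was used
}


@dataclass
class Hypothesis:
    """One candidate parse together with the costly events it triggered."""
    parse: str
    events: list = field(default_factory=list)

    def cost(self) -> float:
        # The total cost is the sum of the costs of all recorded events.
        return sum(EVENT_COSTS[event] for event in self.events)


def least_cost(hypotheses):
    """Return the hypothesis with the lowest accumulated cost."""
    return min(hypotheses, key=lambda h: h.cost())


if __name__ == "__main__":
    candidates = [
        Hypothesis("reading A", ["new_instance", "link_change"]),
        Hypothesis("reading B", ["weak_priming", "new_instance"]),
    ]
    print(least_cost(candidates).parse)
```

A flat sum is the simplest reading of "a cost which is added"; a
spreading activation system would presumably accumulate these costs
incrementally during parsing rather than after the fact.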
natural language understanding; direct memory access; disambiguation;

Fourth Conference of the European Chapter of the Association for
Computational Linguistics. Proceedings of the Conference

Fourth Conference of the European Chapter of the Association for
Computational Linguistics. Proceedings of the Conference

Manchester, UK

10-12 April 1989

Conference  proceedings 

Practical 

ENG 

ZZ 

Assoc. Comput. Linguistics; Morristown, NJ, USA
NP. xxv+326; DP. 1989

The following topics were dealt with: computational lexical
semantics; parsing; grammars; natural language processing; expert
systems; knowledge representation; logic programming; text-to-speech
systems; intelligent tutors; knowledge acquisition; discourse
representation; anaphora resolution; unification grammars; and
machine translation
C6180N; C4210; C7820; C6170
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Textual  and  computational  linguistics 
FERRARI  G. 

Journal  paper 
Practical 
ITA 
ZZ 

Sist. Impresa (Italy); Sistemi & Impresa

VOL. 35; NO. 302; PP. 673-9; 24 Ref.; DP. April 1989

Reviews some of the major areas of computational linguistics and
text processing. These include statistical linguistics,
concordances and lemmatization (including some machine translation
aspects), machine dictionaries, morphological analysis, text
comprehension and generation, and stylistic analysis

C6180N; C7820; C61300
The potential of Swetra - a multilanguage MT system
SIGURD B.; GAWRONSKA WERNGREN B.

Dept. of Linguistics & Phonetics, Lund Univ., Sweden

Journal paper

Practical
Comput. Transl. (USA); Computers and Translation

VOL. 3; NO. 3-4; PP. 237-50; 11 Ref.; DP. 1988-1989

C0MTE5 

0884-0709 

Swetra is a multilanguage MT system defined by the potentials of a
formal grammar (standard referent grammar) and not by reference to
a genre. Successful translation of sentences can be guaranteed if
they are within a specified syntactic format based on a specified
lexicon. The authors discuss the consequences of this approach
(grammatically restricted machine translation, GRMT) and describe
the limits set by a standard choice of grammatical rules for
sentences and clauses, noun phrases, verb phrases, sentence
adverbials, etc. Such rules have been set up for English, Swedish
and Russian, mainly on the basis of familiarity (frequency) and
computer efficiency. However, restricting the grammar and making it
suitable for several languages poses many problems for
optimization. Sample texts (newspaper reports) illustrate the type
of text that can be translated with reasonable success among
Russian, English and Swedish
C7820;  C4210;  C4190 
Automatic computer recognition of German word types

XIE JINBAO; SUN JIEMING; WANG JIAN

Journal paper

Theoretical mathematical

CHI

ZZ

J. Shanghai Jiaotong Univ. (China); Journal of Shanghai Jiaotong
University

VOL. 23; NO. 1; PP. 70-8; 5 Ref.; DP. 1989

SCTPOrt 

0253-9942 

Word type recognition is the basis of natural language
understanding and analysis. The authors give a description of the
possibility of German word type recognition by computer according
to the theory of pattern and pattern matching in SNOBOL as well as
the flexibility of morphology in German. The software employed in
German word type recognition has an accuracy of over 95%
C6180N 
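
As a rough illustration of word type recognition by pattern matching,
the sketch below uses Python regular expressions in place of SNOBOL
patterns. The suffix rules and the function name guess_word_type are
assumptions chosen for the example; they are not taken from the paper
and cover only a handful of regular German endings.

```python
import re

# Ordered (pattern, label) rules; the first match wins.
# These few suffix rules are illustrative assumptions, not the paper's rule set.
RULES = [
    # Capitalised words with typical nominal suffixes.
    (re.compile(r"^[A-ZÄÖÜ].*(ung|heit|keit|schaft)$"), "noun"),
    # Common adjectival suffixes.
    (re.compile(r".*(lich|isch|ig)$"), "adjective"),
    # Infinitive endings.
    (re.compile(r".*(ieren|en)$"), "verb"),
]


def guess_word_type(word: str) -> str:
    """Return the label of the first rule whose pattern matches the word."""
    for pattern, label in RULES:
        if pattern.match(word):
            return label
    return "unknown"


if __name__ == "__main__":
    for w in ["Zeitung", "freundlich", "studieren", "und"]:
        print(w, guess_word_type(w))
```

Rule order matters here: putting the noun rule first keeps a
capitalised word such as "Zeitung" from being tested against the
looser suffix rules that follow.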
Language and meaning
NAGAO M.

Fac. of Eng., Kyoto Univ., Japan

Journal paper

Theoretical mathematical

JAP

JP

J. Inst. Electron. Inf. Commun. Eng. (Japan); Journal of the Institute
of Electronics, Information and Communication Engineers
VOL. 71; NO. 11; PP. 1157-62; 3 Ref.; DP. Nov. 1988
DJTGEB 
0913-5693 

The relationship between language and its meaning is described,
methods of defining meaning are outlined from the standpoint of
computational linguistics, and the role of language in symbols and
images is discussed. The topical areas include: (1) meaning of
word; (2) meaning of paragraph; (3) meaning of sentence; (4) meaning
in translation; and (5) role of language in meaning
C6180N;  C4290 
European Community policy on MT

MT Machine Translation Summit. Manuscripts and Program

Hakone, Kanagawa-ken, Japan

17-19 Sept. 1987

ROLLING L.

Conference paper
General
ENG
ZZ

Toshiba Corp; Kawasaki, Japan

NP. 159; PP. 97-8; 0 Ref.; DP. 1987

One of the roles of the EC Commission is to help the European
Community to overcome the language barriers that are presently
hampering its cultural and economic unification. The Commission is
supporting basic research, developing new tools and resources and
promoting their implementation through the creation of compatible,
user-friendly infrastructures. On the research side, the EUROTRA
programme aims at supplying not only a modular MT system
covering all European languages, but also a valuable test bed
for further research in computational linguistics. In the framework
of the ESPRIT programme, several projects are aimed at the
integration of voice recognition devices in industrial equipment,
but the main research project is one that has undertaken a
thorough analysis of seven European languages with a view to
creating reliable lexical resources for use in both text and
speech translation. The Commission also contributes to the
development of standards for linguistic tools and resources,
including lexical data banks, text and speech corpora and
multilingual thesauri. Another major effort of the Commission has
been its contribution to the development of the SYSTRAN system for
a number of European language pairs
C0230;  C7820 
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Prospects  in  machine  translation 

MT Machine Translation Summit. Manuscripts and Program

Hakone, Kanagawa-ken, Japan

17-19 Sept. 1987

HUTCHINS W. J.

Univ. of East Anglia, Norwich, UK

Conference  paper 

General 

ENG 
Toshiba Corp; Kawasaki, Japan
NP. 159; PP. 48-52; 0 Ref.; DP. 1987
Reviews the state of the art in MT systems, noting that no
operational systems can produce good quality output without
placing restrictions on input texts or involving human
assistance. MT systems under development are based on the
syntax-oriented approach of computational linguistics. AI
approaches offer the scope for considerable improvements. The best
prospects for future fully automated translation systems will be
those combining traditional linguistics approaches and knowledge
based approaches
C7820; C6170; C6180N
Governmental views of MT

MT Machine Translation Summit. Manuscripts and Program
Hakone, Kanagawa-ken, Japan
17-19 Sept. 1987
CZERMAK J. M.

Conference  paper 

General 

ENG 

ZZ 

Toshiba Corp; Kawasaki, Japan
NP. 159; PP. 38-9; 0 Ref.; DP. 1987

Language will play an important role in future R&D work. It is
amenable to computer representation. Language R&D still leaves many
questions unresolved. Translation is one of those. The computer will
contribute to the resolution of these questions. The discipline of
computational linguistics may not be sufficiently
mature. Computational linguistics will become a major field of
international R&D cooperation
C0230;  C7820 
Computer aided translation system and computerized
dictionary

MT Machine Translation Summit. Manuscripts and Program
Hakone, Kanagawa-ken, Japan
17-19 Sept. 1987
YAMAOKA Y.

Conference paper

Practical
Toshiba Corp; Kawasaki, Japan

NP. 159; PP. 141-2; 0 Ref.; DP. 1987

ISS Inc. are engaged in language services including translation
services and are one of the leading companies in Japan in this
field. Their customers require the highest possible quality of
translation from English to Japanese for their sales materials,
manuals, catalogues and so on. ISS have therefore introduced a
computer aided translation system developed by
Toshiba. They have been utilizing and improving this system to
assist translation activities conducted by experienced and
professional translators, mainly for the following purposes:
standardisation and unification of terminology and customization
of terminology
C7820 
Fujitsu machine translation system

MT Machine Translation Summit. Manuscripts and Program

Hakone, Kanagawa-ken, Japan

17-19 Sept. 1987

UCHIDA H.

Fujitsu Labs. Ltd., Japan
Conference paper
Practical; Product reviews
ENG
JP

Toshiba Corp; Kawasaki, Japan

NP. 159; PP. 129-34; 0 Ref.; DP. 1987

Due to the rapid advancement of both computer technology and
linguistic theory, machine translation systems are coming into
practical use. Fujitsu has two machine translation systems. ATLAS-I
is a syntax-based machine translation system. ATLAS II is a
semantic-based system which aims at high quality multilingual
translation. The ATLAS II translation mechanism is explained; it
involves analysis, transfer and generation processes
C7820;  C6J300 
MT Machine Translation Summit. Manuscripts and Program
Hakone, Kanagawa-ken, Japan
17-19 Sept. 1987
SUDARWO I.

Conference paper

Practical
ZZ

Toshiba Corp; Kawasaki, Japan
NP. 159; PP. 113; 0 Ref.; DP. 1987

Summary form only given. Machine translation technology will bring
new opportunities for Indonesia. An Indonesian agency is therefore
conducting research and developing a prototype English-Indonesian
computer aided translation system. Involvement in a
Japanese R&D project to develop multilanguage machine translation
is also outlined
C7820;  C0230 
Kana-to-Kanji translation based on collocational analysis for
non-segmented input
YAMASHINA M.; OBASHI F.

Human Interface Labs., NTT, Tokyo, Japan

Journal paper
Rev. Electr. Commun. Lab. (Japan); Review of the Electrical
VOL. 37; NO. 1; PP. 65-70; 13 Ref.; DP. Jan. 1989

RELTAN 

0029-067X 

Proposes a new disambiguation method for Kana-to-Kanji
translation. The method evaluates candidate sentences by measuring
the number of word cooccurrence patterns (WCP) included in the
candidate sentences. An automatic WCP extraction method is also
developed and about 305000 sets of WCP are collected from example
sentences in dictionaries by this method. Using a WCP matrix
organized by semantic category, the mean number of candidate
sentences in Kana-to-Kanji translation is reduced to about 1/10 of
those produced by existing morphological methods, and results in a
translation accuracy of 95%
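
The candidate scoring described in the abstract above, counting how
many known word cooccurrence patterns a candidate sentence contains,
might be sketched as follows. The function names, the representation
of a pattern as an unordered word pair, and the toy data are all
assumptions for illustration; the paper's WCP matrix organized by
semantic category is not reproduced here.

```python
from itertools import combinations


def wcp_score(candidate, patterns):
    """Count the word pairs in a candidate that occur in the pattern set.

    candidate: list of words (one possible Kanji reading of the input)
    patterns:  set of frozensets, each an unordered cooccurring word pair
    """
    return sum(
        1 for pair in combinations(candidate, 2) if frozenset(pair) in patterns
    )


def best_candidate(candidates, patterns):
    """Return the candidate sentence containing the most known patterns."""
    return max(candidates, key=lambda c: wcp_score(c, patterns))


if __name__ == "__main__":
    # Toy data: English glosses stand in for homophonous Kanji readings.
    patterns = {
        frozenset(("reporter", "interview")),
        frozenset(("train", "station")),
    }
    candidates = [
        ["train", "interview"],
        ["reporter", "interview"],
    ]
    print(best_candidate(candidates, patterns))
```

When several candidates tie, max returns the first of them, so a real
system would need a further tie-breaking criterion.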
Interdoc (indexing retrieval aid)

JOSCELYNE A.

Journal paper
Practical
RUS
ZZ

Lang. Technol. (Netherlands); Language Technology
NO. 11; PP. 28-31; 0 Ref.; DP. Jan.-Feb. 1989
LANTEB 

The paper discusses the sophisticated documentation tool Interdoc,
an indexing-retrieval enhancement to CAT (computer aided
translation). Whereas CAT was primarily aimed at the
dictionary-using translator, Interdoc is specifically designed to
be used by a wide range of professionals, including corporate
indexers and target-language end users. The core idea is that of
the corporate knowledge system
C7250;  C7240 
A mathematical model for translations of natural languages
KATZ E.; LEIFMAN L. J.; MARTY R. H.; ROBINSON S. M.

Dept. of Math., Cleveland State Univ., OH, USA

Journal paper

Theoretical mathematical

ENG

US

Inf. Sci. (USA); Information Sciences

VOL. 47; NO. 1; PP. 35-45; 10 Ref.; DP. Feb. 1989

ISIJBC 

0020-0255 

0020-0255/89/ 103,50 

Several mathematical models have been introduced in linguistics
and in translations of languages. The essential mathematical tools
used have been algebra, probability, logic, etc. The authors
introduce a topological model for languages and their
translations. Using this model, they prove that every text in a
major language has a best approximation text in any other major
language. They prove, similarly, that every text in any major
language has a best approximation within the language itself. This
permits the theoretical possibility of automatic translations
C4290; C7820
CATEC - a computer aided translation of English to
Chinese system

1988 International Conference on Computer Processing of Chinese
and Oriental Languages. Proceedings

Toronto, Ont., Canada

29 Aug.-1 Sept. 1988

TOU J. T.

Center for Inf. Res., Florida Univ., Gainesville, FL, USA
Chinese Language Comput. Soc.; Chinese Canadian Inf. Processing
Professionals; Philips Electron
Conference paper
US

Concordia Univ; Montreal, Que., Canada
NP. xvii+645; PP. 475-9; 15 Ref.; DP. 1988

Presents a new system for computer-aided translation
of technical and scientific publications in English into Chinese
language. This system is based upon the innovative idea of
linguistic canonical transformation in order to incorporate the
cultural aspects of a natural language. Via paraphrasing by
computer, the messages and information contained in
a complex sentence or a set of sentences are expressed in terms of
several simple sentences. By making use of a knowledge-base of
skilled translators' expertise, these sentences are converted to
'Chinese English' sentences which are referred to as linguistic
canonical forms. The Chinese English sentences are then translated
into Chinese text
C7820; C4210; C6170
The Saarbrucken Translation Service STS - computer-aided
translation for specialised information centers
LUCKHARDT H. D.; ZIMMERMANN H. H.
Univ. des Saarlandes, Saarbrucken, West Germany
Journal paper
Practical
NADOAW
0027-7436

0027-7436/88/0612-0351$02.50/0

The paper presents the Saarbrucken Computer-Aided
Translation Service (STS) being developed in the project MARIS
(Multilingual Application of Reference-Oriented Information
Systems) at the Information Science Department of the University
of Saarbrucken. Intellectual and machine translation (esp. German to
English) are combined in a joint system surrounding (translator's
workstation). MARIS applies methods and (sub)systems developed for
machine translation to titles, abstracts, and descriptors from
German databases. About 2 million words have been translated so
far. The MARIS project is funded by the Federal Ministry of Science
and Technology
This paper presents a sophisticated method for an
English-Indonesian machine translation system called EICATS
(English-Indonesian computer aided translation
system). In general, a machine translation system consists of three
main processes, namely analysis, transfer and generation. Depending
on the method that is used in the transfer level, machine
translation systems can be classified into four methods:
syntactical direct, transfer, integration and the interlingua or
pivot method. In EICATS, the analysis, transfer and generation
processes are not handled as independent processes, but are
integrated. Consequently, the translation process is done in real
time, approaching the behaviour of a human translator model
R. Cohen (Computational Linguistics, vol.13, no.1-2, p.11-24, 1987)
has described a model for the analysis of arguments that includes:
(1) a theory of expected coherent structure, which is used to
limit analysis to the reconstruction of particular transmission
forms; (2) a theory of linguistic clues which assigns a functional
interpretation to special words and phrases used by the speaker to
indicate the structure of the argument; and (3) a theory of
evidence relationships which induces the demand for pragmatic
analysis to accommodate beliefs not currently held. The author
summarizes the prescriptions for coherent analysis, with a view to
their application in the translation of technical material
The book presents a description of the analysis of English in the
framework of machine translation experimentation carried out by
the linguistic group at Charles University in Prague. The project
in question, called APAC2, represents the second experiment in a
series of three. The book covers formal representation, program
structure, morphemic analysis and dictionaries, the noun syntax,
and the verb
One of the most important problems facing the transfer of
microcomputing technology is that most of the software is
developed for English language speaking communities. This is mainly
because the hardware is 'Latin' based and most of the programming
languages are 'like English' languages. However, technological
developments have resulted in the production of multilingual
microcomputer hardware. Concentration now is in the development of
'foreign' software. The author presents the initial results of a
research project dealing with transforming software written for
English language users to Arabic language users with multilingual
hardware. A computer program is presented which may facilitate the
transformation process along with a sample application for a
transformed project management program
Building on a method of compressing lexical information, the
authors have defined a set of algorithms providing the minimum
amount  of  information  necessary  to  generate  all  forms  in  the 
German  lexicon  and  to  detect  spelling  errors. Master  rams  were 
marked  for  part  of  speech  and  desinences  with  a  view  to  also 
allowing  possible  inference  drills  for  use  in  foreign  language 
instruction 
The author's main concern is raising questions rather than giving
answers. His starting point is Hans Uszkoreit's revised version of
the LP (linear precedence) component (cf. Uszkoreit 1984 and 1986)
within the GPSG formalism (cf. Gazdar et al. 1985). He discusses some
problems of Uszkoreit's approach that result from the fact that
the whole complex phenomenon of German word order is described at
a unique level of linguistic representation. He then proposes a
somewhat speculative solution to some of these problems, which is
based on a multi-level approach to analysis and generation within
the context of machine translation
Describes an experiment to investigate the characterisation of
Japanese morpho-syntax within a lexicalist framework. It forms part
of a study into English and Japanese grammars from the parochial,
contrastive and universal viewpoints, which is intended to support
the implementation of machine translation systems between the two
languages
The following topics were dealt with: separating linguistic
analyses from linguistic theories; applicability of indexed
grammars to natural languages; natural language toolkit; extension
of LR-parsing for lexical functional grammar; efficiency-oriented
LFG parser; parsing with a GB-grammar; combining categorial grammar
and unification; feature-based categorial morpho-syntax for
Japanese; treatment of the French adjectif detache in lexical
functional grammar; problems of coordination in German; German word
order and universal grammar; nonlocal dependencies and infinitival
constructions in German; GPSG and German word order; nested Cooper
storage: proper treatment of quantification in ordinary noun
phrases; and compositional semantics for LFG. Abstracts of
individual papers can be found under the relevant classification
codes in this or other issues
The French national project of computer-aided
translation (Traduction Assistee par Ordinateur, TAO) has led
to the implementation of a production system called CALLIOPE-AERO
using the software tool ARIANE. This system permits automatic
translation into English of texts written in French in the field
of aircraft maintenance. After a brief account of the architecture
of the system, the author indicates its main performance
characteristics as measured in that application and then considers
what economic conclusions should be drawn from this first
full-scale experience regarding the development of a linguistic
software industry for computer-aided translation
The following topics were dealt with: the conference aimed to
examine the whole flow of information from industrial producers to
those who use the industrial products. It included current
practices and recent development concerning documentation for
offshore platforms, building industry, aviation, and electronics
industry. Topics covered included: the aims of the French national
project of computer-aided translation, and some
holistic and sociodynamic aspects of industrial product
documentation. Abstracts of individual papers can be found under
the relevant classification codes in this or other issues
The paper aims at presenting a conceptual analysis of a new
computer-aided translation system (CATS) paradigm,
taking into consideration basic human information processing
capabilities. Another feature of the suggested approach is that it
allows DMncloa* hardware implementation of different input text
analysis phases, thus eliminating the need of the large,
complicated resident software used for language parsing, aiming at
radical CATS architecture changes. The idea is to overcome extant
hardware limitations by using the advantages of parallel access
and associative information processing in holographic storage media
C7820 
years, but how practical is it and how can it be best used? The
author investigates the latest developments.

C7820

language translation

text translation; computer aided translation


C88021478

A Korean-English machine translation system based on lexical
association grammar

Proceedings of TENCON 87: 1987 IEEE Region 10 conference
'Computers and Communications Technology Toward 2000'
(Cat. No.87CH2423-2)

Seoul, South Korea
25-28 Aug. 1987

LEE J. K.; HAN S. K.; vEE S. H.

Dept. of Electron. Eng., Inha Univ., Incheon, South Korea

IEEE; Korea Inst. Electron. Eng.; Minist. Commun.; et al

Conference  paper 

Practical 

ENG 

KR 

IEEE; New York, NY, USA

NP. 3 vol. 1380; PP. 516-19 vol.2; 7 Ref.; DP. 1987
CH2423-2/87/0000-0516$01.00

The implementation of a Korean-English machine translation system
is described. To solve the complexity of machine translation
problems effectively, a lexical association grammar (LAG) is
proposed that is based on the analysis of a language cognition
system. As LAG has both universality to represent the conceptual
structure of language and particularity to generate the surface
structure, it is especially effective for translation between
different language families. The translation process is explained
in detail
This paper discusses some central issues of the questions of the
semantic valency of nouns in the context of the multilingual
machine translation project Eurotra. It discusses some of the most
influential positions which are known from the linguistic
literature. It outlines approaches to non de-verbal nouns and to
de-verbal nouns which are not identical, but within one overall
theoretical perspective. The treatment of semantic elements which
are not valency bound is discussed. Interesting questions connected
with an implementation of the framework are examined as well as
areas of research and implementation which are necessary
Looks at an English-to-Czech machine translation system called
APAC3-2. The text is divided into three chapters, the first of
which presents a o-ir survey of the theoretical linguistic

'tware  tools  used. Chapter  2  is  devoted  to 
features  of  the  system,  of  its  Halts, 
ative)  representation  of  factual 
how ambiguities are being solved, of the
fail-soft -# Used, and so on. A detailed description of the
present $h**~ of the program is contained in Chapter 3
Computer aided translation has been around for many


C88015567

The dictionary in the Eurotra engineering framework
MAAS  H.  D. 

IAI,  Saarbrucken,  West  Germany 

Journal  paper 

Practical 

ENG 

DE

Sprache Datenverarb. (West Germany)

VOL. 11; NO. 1; PP. 15-21; 0 Ref.; DP. 1987

SPDADH

0343-5202

The author gives an overview of the lexicon framework as worked
out by the Dictionary Task Force of Eurotra. The article reflects
the state of the art by the end of March 1987. It covers what
generators do; the place of the inner dictionary in a
generator; treatment of frames; treatment of
idioms; structure-to-feature translations; dictionary
coding; relations between dictionaries of different levels; and
examples for using the dictionary in analysis and in generation
General-purpose database management systems, whose structure is
built in, are not an appropriate solution to situations where
problems of translation or areas of research cannot be bounded in
advance, for example, when lexicography and linguistic research
are closely related. Consequently, an original system has been
developed, and is being applied to linguistic and lexicographical
data on the Somali language. The collaborative project led to the
creation of an automatic lexicogrammatical data management
system. The basic hardware provided for this application was a
Plessey LSI-11/03 microcomputer with a 64-Kbyte memory, two
i-Koyte floppy disk drives, a video terminal, and a
printer. Although originally designed for the Somali language, the
system is independent of any language and of any specific
application. First the authors present the linguistic context and
the computer context. Then they describe the system itself, with
special emphasis on the original aspects. Finally some examples of
work sessions are presented
C7820:  C0J6O 
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Text  processing  in  the  Leningrad  research  group  'Speech 
The article discusses some semiotic informational aspects of
language along with their interpretation in terms of computational
linguistics. The paper describes and indicates the resolution of
some semiotic and linguistic paradoxes which create, at present, a
rejecting barrier between natural language and the computer
'SPSS'-an algorithm and data structures design for a machine aided
English-to-Czech translation system
Some of the problems caused by ambiguity in machine translation
systems are outlined. Approaches to system design are discussed. The
author describes a system which offers alternative translations of
ambiguous words and phrases to a user. The task is examined from
the linguistic point of view. The data structures and principles of
the algorithm are presented
Research and development in computerized translation is conducted
in two areas: computer-aided translation performed by
human translators (using computerized terminology dictionaries and
data banks) and automatic translation performed by computers
assisted by human translators. There has been a growing realization
that fully automatic translation can be achieved only in
exceptional cases where a very general idea of the contents of the
original would be sufficient for the user, or if the original
texts have a simple and standard format. In all the other
situations, researchers are becoming increasingly aware that
translation programs have to rely on human participation at
certain points in the process
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Features  of  processing  of  unidentified  words  in  a  machine 
translation  system 
An efficient mechanism for recognizing grammatical characteristics
of keywords has been developed for machine translation and other
automatic text-processing systems. By matching parts of words
(mostly word endings) with similar parts of words stored in the
system, the syntactic category and some of the grammatical (and
also semantic) characteristics are determined. Efficient operation
of this algorithm makes it possible to continue with the analysis,
but it does not produce the translation of the partially
identified word. The new component of the unidentified word
processing algorithm has two tasks: separating the group of
unidentified words by dividing it into new words and misspelled
words (diagnosis of distortions) and reconstructing the prototypes
of misspelled words (distortion correction). Both tasks are
accomplished with a common strategy based on a formal description
of the word's graphematics, i.e. regularities in the combinations
of graphemes in words
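
The ending-matching step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the suffix table, the category labels and the function name `guess_category` are hypothetical and are not taken from the system described; the sketch simply returns the category attached to the longest stored ending that matches an unknown word.

```python
# Hypothetical table of word endings and syntactic categories.
# The entries are illustrative only and do not come from the abstract.
SUFFIX_TABLE = {
    "ation": "noun",
    "ness": "noun",
    "ing": "verb",
    "ize": "verb",
    "ly": "adverb",
    "ous": "adjective",
}


def guess_category(word, table=SUFFIX_TABLE):
    """Return the category of the longest matching ending, or None."""
    lowered = word.lower()
    # Try longer endings first so that the most specific ending wins.
    for suffix in sorted(table, key=len, reverse=True):
        # Require at least one character before the ending.
        if lowered.endswith(suffix) and len(lowered) > len(suffix):
            return table[suffix]
    return None
```

A word with no stored ending yields `None`, which corresponds to the case where analysis cannot assign a category and the word would have to be passed to a separate step such as misspelling diagnosis.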
GUILBAUD J. P.

GETA, Univ. Sci. et Med. de Grenoble, Saint-Martin d'Heres, France

Conference paper
NP. xix+675; PP. 405-7; 4 Ref.; DP. 1986
Grammatical categories used in a translation model, Ariane, are
formalised, and the variables of the metalanguage used to describe
the source and target languages of the model are
discussed. Variables in the linguistic structure interface, SLI and
other grammatical categories are considered under seven
headings. Particular attention is given to the structure,
morphology and syntax of the German language in this context
Strategies and heuristics in the analysis of a natural language in
machine translation

11th International Conference on Computational
Linguistics. Proceedings of Coling '86
Bonn, Germany
25-29 Aug. 1986
YUSOFF Z.

Groupe d'Etudes pour la Traduction Autom., Grenoble Univ.,
Saint-Martin-d'Heres, France

Conference paper
NP. xix+675; PP. 138-9; 12 Ref.; DP. 1986

The analysis phase in an indirect, transfer and global approach to
machine translation is studied. The analysis conducted can be
described as exhaustive (meaning with backtracking), depth-first
and strategically and heuristically driven, while the grammar used
is an augmented context free grammar. The problem areas, being
pattern matching, ambiguities, forward propagation, checking for
correctness and backtracking, are highlighted. Established results
found in the literature are employed whenever adaptable, while
suggestions are given otherwise
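
As a rough illustration of what exhaustive, depth-first analysis with backtracking over a context-free grammar means, the sketch below enumerates every parse of a token sequence. It is not the system described in the abstract: the toy grammar, the lexicon and the names `parse`, `expand` and `parse_sentence` are assumptions made for the example, the grammar is plain rather than augmented, and no strategic or heuristic control is modelled.

```python
# Toy grammar and lexicon, invented for this example only.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {"the": "Det", "dog": "N", "cats": "N", "sees": "V"}


def parse(symbol, tokens, pos):
    """Yield (tree, next_pos) for each way symbol can start at pos."""
    if symbol not in GRAMMAR:
        # Pre-terminal: succeed only if the next token has this category.
        if pos < len(tokens) and LEXICON.get(tokens[pos]) == symbol:
            yield (symbol, tokens[pos]), pos + 1
        return
    # Try each alternative in order; when one is exhausted the generator
    # falls through to the next, which is the backtracking step.
    for rhs in GRAMMAR[symbol]:
        for children, end in expand(rhs, tokens, pos):
            yield (symbol, children), end


def expand(rhs, tokens, pos):
    """Yield (children, next_pos) for each way to match the symbols in rhs."""
    if not rhs:
        yield [], pos
        return
    # Depth-first: fully explore the first symbol before the rest.
    for tree, mid in parse(rhs[0], tokens, pos):
        for rest, end in expand(rhs[1:], tokens, mid):
            yield [tree] + rest, end


def parse_sentence(sentence):
    """Return every parse tree that covers the whole sentence."""
    tokens = sentence.split()
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]
```

Because the toy grammar has no left-recursive rules, the search always terminates; a grammar with left recursion would need a different control strategy.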
The translation method of Rosetta
LEERMAKERS R.; ROUS J.
The authors explain and motivate the translation method of the
Rosetta project. They present a stepwise way of unravelling the
various aspects of machine translation. The general strategy is to
view a translation system as being composed of an analysis part
and a generation part connected by a transfer module, and to break
down the latter in a systematic way. They do this by repeatedly
identifying tasks inside the transfer module, which can be moved
as new modules, with well-defined interfaces, into the (initially
empty) analysis and generation parts. Thus, by each such move
analysis and generation are augmented with a deeper level in a
clear way, and the transfer task is reduced accordingly
Symmetric rules for translation of English and Chinese
WANYING JIN; SIMMONS R. F.
A system of grammars using symmetric phrase structure and
translation rules in a Lisp version of Prolog is shown to provide
symmetric bidirectional translation between English and Chinese
for a fragment of the two languages. It is argued that symmetric
grammars and translation rules significantly reduce the total
grammar writing requirement for translation systems, and that
research on symmetric translation systems deserves further study
Linguistic and extra-linguistic knowledge: a catalogue of
language-related rules and their computational application in
machine translation
SCHUBERT K.

Journal  paper 
The author gives an overview of the problems encountered in
translating a text. Language is a rule-governed system. Language
science is the discovery, translation and application of these
rules. But while a human translator can use the rules intuitively,
the application of a computer involves the necessity of
formulating the rules explicitly. Translation requires rules about
both inside and outside influences on language; these rules in turn
presume knowledge about those language related influences. After
looking at the theoretical basis of this view he describes the
practical details of the DLT machine translation system starting
from the search for rules and knowledge. He sums up the rule
systems and relates them to the types of knowledge they
require. This concordance of rules and knowledge leads into a
discussion of three characteristic features of the DLT system
which might seem controversial, but which can then be shown to be
strictly related to the rules and knowledge needed for machine
translation. Finally the question of priority for either
language-specific or extralinguistic rules and knowledge is taken
up
To determine the 'structure' of an input sentence is the major
problem of computational linguistics. When the 'structure' to be
obtained finally is regarded as the structure required to
represent the results of understanding, most of the studies on
computational linguistics (including meaning processing and
context processing) can be summarized into the analysis of a
natural language sentence. The basic frames for sentence analysis,
such as added context-free grammar, tree structure
transformational grammar, etc. have been applied to practical
systems, including mechanical translation systems. The author
describes the current state and the basic problems concerning the
studies of analysis methods for a natural language sentence
Computational linguistics has been studied since the development
of computers and has developed along with studies of mechanical
translation. Mechanical translation was intensively studied from
the latter half of 1950s but the US Congress concluded in 1965
that mechanical translation could not be materialized in a short
period and that basic scientific studies on languages
(computational linguistics) should be promoted (ALPAC
Report). Computational linguistics includes the fields of
acoustics, phonetics, phonology, morphology, lexicology, syntax,
semantics, pragmatics, discourse, recognition and understanding,
synthesis and generation, dialectology, translation, documentation
writing aids, stylistics, content analysis, information retrieval,
office automation, instruction, computer interfaces, graphics,
speech, sign languages and animal languages
Building on the well-established premise that reliable machine
translation requires a significant degree of text comprehension,
this paper presents a recent advance in multi-lingual
knowledge-based machine translation (KBMT). Unlike previous
approaches, the current method provides for separate syntactic and
semantic knowledge sources that are integrated dynamically for
parsing and generation. Such a separation enables the system to
have syntactic grammars, language specific but domain general, and
semantic knowledge bases, domain specific but language
general. Subsequently, grammars and domain knowledge are
precompiled automatically in any desired combination to produce
very efficient and very thorough real-time parsers. A pilot
implementation of the KBMT architecture using functional grammars
and entity-oriented semantics demonstrates the feasibility of the
new approach
The article presents a linguistic model for language understanding
and describes its application to an experimental machine
translation system called LUTE. The language understanding model is
an interactive model between the memory structure and a text. The
memory structure is hierarchical and represented in a
frame-network. Linguistic and non-linguistic knowledge is stored
and the result of understanding the text is assimilated into the
memory structure. The understanding process is interactive in that
the text invokes knowledge and the understanding procedure
interprets the text by using that knowledge. A linguistic model,
called the extended case structure model, is defined by adopting
three kinds of information: structure, relation and concept. These
three are used recursively and iteratively as the basis for memory
organization. These principles are applied to the design and
implementation of the LUTE which translates Japanese into English
and vice versa
result and evaluation

11th International Conference on Computational
Linguistics. Proceedings of Coling '86
Bonn, Germany
25-29 Aug. 1986

HANAKATA K.; LESNIEWSKI A.; YOKOYAMA S.

Inst. fur Inf., Stuttgart Univ., Germany

Conference paper

Practical

ENG

DE

Inst. Angewandte Kommunikations & Sprachforschung; Bonn, Germany
NP. xix+675; PP. 560-2; 4 Ref.; DP. 1986
Project SEMSYN has achieved a state where a prototype system
generates German texts on the basis of the semantic representation
produced from Japanese texts by ATLAS/II of Fujitsu
Laboratory. This paper describes some problems that are specific to
the semantic based approach and some results of the evaluation
study
NP. xix+675; PP. 464-9; 20 Ref.; DP. 1986
Controlled active procedures are productions that are grouped
under and activated by units called 'scouts'. Scouts are controlled
by units called 'missions', which also select relevant sections
from the data structure for rule application. Following the problem
reduction method, the parsing problem is subdivided into ever
smaller subproblems, each one of which is represented by a
mission. The elementary problems are represented by scouts. The CAP
grammar formalism is based on experience gained with natural
language analysis and translation by computer in the
Sonderforschungsbereich 100 at the University of Saarbrucken. The
paper introduces CAP as a means of linguistic engineering
Model for lexical knowledge base
11th International Conference on Computational
Describes a model for a lexical knowledge base (LKB). An LKB is a
knowledge base management system (KBMS) which stores various kinds
of dictionary knowledge in a uniform framework and provides
multiple viewpoints to the stored knowledge. KBMSs for natural
language knowledge will be fundamental components of knowledgeable
environments where non-computer professionals can use various
kinds of support tools for document preparation or
translation. However, basic models for such KBMSs have not been
established yet. Thus, the authors propose a model for an LKB
focusing on dictionary knowledge such as that obtained from
machine-readable dictionaries
C6160;  C7820 
Acquisition of knowledge data by analyzing natural language
Automatic identification of homonyms in kana-to-kanji conversion
systems and of multivocal words in machine translation systems
cannot be sufficiently implemented by the mere combination of
grammar and word dictionaries. This calls for a new concept of
knowledge data. What the new knowledge data is and how it can be
acquired are mentioned in the paper. In natural language research,
active discussion has been made within the framework of knowledge
and samples of knowledge
C7820:  C4290;  C6130 
A crucial test for any MT system is its power to solve lexical
ambiguities. The size of the lexicon, its structural principles and
the availability of extra-linguistic knowledge are the most
important aspects in this respect. The paper outlines the
experimental development of the SWESIL system: a structured
lexicon-based word expert system designed to play a pivotal role
in the process of distributed language translation (DLT) which is
being developed in the Netherlands. It presents SWESIL's organizing
principles, gives a short description of the present experimental
set-up and shows how SWESIL is being tested at this moment
In the framework of machine aided translation systems, two types
of lexical knowledge are used, 'natural' and 'formal', in the form
of online terminological resources for human translators or
revisors and of coded dictionaries for machine translation
proper. A new organization is presented, which allows both types to
be integrated in a unique structure, called 'fork' integrated
dictionary, or FID. A given FID is associated with one natural
language and may give access to translations into several other
languages. The FIDs associated to languages L1 and L2 contain all
information necessary to generate coded dictionaries of M(a)T
systems translating from L1 into L2 or vice-versa
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NP. xix+675; PP. 412-17; 19 Ref.; DP. 1986

CRITAC (critiquing using accumulated knowledge) is an experimental
expert system for proofreading Japanese text. It detects mistypes,
Kana-to-Kanji misconversions, and stylistic errors. This system
combines Prolog-coded heuristic knowledge with conventional
Japanese text processing techniques which involve heavy
computation and access to large language databases
C7106; C7820
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Nowadays, MT systems grow to such a size that a first
specification step is necessary if one wants to be able to master
their development and maintenance, for the software part as well
as for the linguistic part ('lingwares'). Advocating a clean
separation between linguistic tasks and programming tasks, the
paper introduces a specification/implementation/validation
framework for NLP, then SCSL, a language for the specification of
analysis and generation modules
Ceuoo;  C7820 
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NP. xix+675; PP. 390-2; 0 Ref.; DP. 1986
PeriPhrase is a high-level computer language developed by
A.L.P. Systems to facilitate parsing and structural transfer. It is
designed to speed the development of computer-assisted translation
systems and grammar checkers. The syntax and semantics of this tool
are described together with its integrated development environment
C6I400:  C7820 
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TEXAN is a system of transfer-oriented text analysis. Its
linguistic concept is based on a communicative approach within the
framework of speech act theory. In this view texts are considered
to be the result of linguistic actions. It is assumed that they
control the selection of translation equivalents. The transition of
this concept of linguistic actions (text acts) to the model of
computer analysis is performed by a context-free illocution grammar
processing categories of actions and a propositional structure of
states of affairs. The grammar, which is related to a text lexicon,
provides the connection of these categories and the linguistic
surface units of a single language
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The authors are designing an English-to-Japanese interactive
machine translation system. For development purposes we are using
an existing corpus of 10000 words of continuous prose from the ICL
PERQ's graphics documentation; in the long term, the system will be
extended for use by technical writers in fields other than
software. The authors have developed system development software,
user interface, and grammar and dictionary handling utilities. The
English analysis grammar handles most of the syntactic structures
of the corpus, and there are a range of formats for output of
linguistic representations and Japanese text. A transfer grammar
for English-Japanese has been prototyped, but is not yet fully
adequate to handle all constructions in the corpus; a facility for
dictionary entry in kanji is incorporated. The authors focus on its
interactive nature, discussing the range of different types of
interaction which are provided or permitted for different types of
user
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The authors present a new computing model for constructing a
two-way simultaneous interpretation system between Korean and
Japanese. They also propose several methodological approaches to
the construction of a two-way simultaneous interpretation system,
and realize the two-way interpreting process as a model unifying
both linguistic competence and linguistic performance. The model is
verified theoretically and through actual applications
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Rosetta is an experimental translation system which uses an
intermediate language and translates between Dutch, English and,
in the future, Spanish. The theoretical framework of Rosetta, which
is based on isomorphic M-grammars, is outlined. Idioms are then
discussed in this framework. Finally some examples are considered
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NP. xix+675; PP. 313-18; 12 Ref.; DP. 1986
This paper discusses the translation of temporal expressions, in
the framework of the machine translation system Rosetta. The
translation method of Rosetta, the 'isomorphic grammar method', is
based on Montague's Compositionality Principle. It is shown that a
compositional approach leads to a transparent account of the
complex aspects of time in natural language and can be used for
the translation of temporal expressions
C7820; C4290
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The author investigates valency theory as a linguistic tool in
machine translation. There are three main areas in which major
questions arise. (1) Valency theory itself. He sketches a valency
theory in linguistic terms which includes the discussion of the
nature of dependency representation as an interface for semantic
description. (2) The dependency representation in the translation
process. He sketches the different roles of dependency
representation in analysis and generation. (3) The implementation
of valency theory in an MT-system. He gives a few examples for how
a valency description could be implemented in the Eurotra-formalism
C7820; C4290
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Analysis and generation of clauses within the Eurotra-framework
proceeds through the levels of (at least) Eurotra constituent
structure (ECS), Eurotra relational structure (ERS) and interface
structure (IS). At IS, labelling of nodes consists of labellings
for time, modality, semantic features, semantic relations and
others. In this paper, we shall be concerned exclusively with
semantic relations (SRs) or participant roles (PRs). In Eurotra-D,
they have been experimenting with a set of SRs, or PRs, which are
identified with the help of syntactic criteria. This approach is
outlined
C7820; C4290

computational linguistics; language translation
clause generation; semantic structures; constituent structure;
relational structure; interface structure; semantic relations;
participant roles; syntactic criteria


C8702845B 

The <C,A>,T framework in Eurotra: a theoretically committed
notation for MT (machine translation)
11th International Conference on Computational
Linguistics. Proceedings of Coling '86
Bonn, Germany
25-29 Aug. 1986
ARNOLD D. J.; KRAUWER S.; ROSNER M.; DES TOMBE L.; VARILE G. B.
Essex Univ., Colchester, England
Conference paper
Theoretical mathematical
ENG
GB
Inst. Angewandte Kommunikations & Sprachforschung; Bonn, Germany
NP. xix+675; PP. 297-303; 10 Ref.; DP. 1986
This paper describes a model for MT, developed within the Eurotra
MT project, based on the idea of compositional translation, by
describing a basic, experimental notation which embodies the
idea. Some of the theoretical and practical implications of the
model, including some concrete extensions, and some more
speculative aspects are discussed
C4290; C7820

computational linguistics; language translation; research initiatives
Eurotra MT project; compositional translation
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Linguistic developments in Eurotra since 1983
11th International Conference on Computational
Linguistics. Proceedings of Coling '86
Bonn, Germany
25-29 Aug. 1986
JASPAERT L.
Katholieke Univ. Leuven, Belgium
Conference paper
General
ENG
BE
Inst. Angewandte Kommunikations & Sprachforschung; Bonn, Germany
NP. xix+675; PP. 294-6; 6 Ref.; DP. 1986
The author puts the theory and metatheory currently adopted in the
Eurotra project into a historical perspective, indicating where
and why changes to its basic design for a transfer-based machine
translation (TBMT) system have been made
C7820; C4290

computational linguistics; language translation; research initiatives
linguistics; monostratal dimensionality; Eurotra project; transfer
based machine translation
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11th International Conference on Computational
Linguistics. Proceedings of Coling '86
11th International Conference on Computational
Linguistics. Proceedings of Coling '86
Bonn, Germany
25-29 Aug. 1986
Conference proceedings
Practical; Theoretical mathematical
ENG
ZZ
Inst. Angewandte Kommunikations & Sprachforschung; Bonn, Germany
NP. xix+675; DP. 1986
The following topics were dealt with: sign-theoretical model of
semantic structure; computational analysis; linguistic
semantics; temporal relations; term associations in automatic
information retrieval; lexical data; PeriPhrase, lingware for
parsing and structural transfer; SCSL, linguistic specification
language for MT; ATN programming environment; CRITAC, Japanese text
proofreading system; integer codes for text storage; BetaText,
event-driven text processing and text analysing system; integrated
dictionaries for M(a)T; word database for national language
processing; automatic thesaurus construction; and functional
structures for parsing dependency constraints. Abstracts of
individual papers can be found under the relevant classification
codes in this or other issues
C4290; C7820

computational linguistics; language translation; natural languages
sign theoretical model; semantic structure; computational analysis;
linguistic semantics; temporal relations; term associations;
automatic information retrieval; lexical data; PeriPhrase; lingware;
parsing; structural transfer; SCSL; linguistic specification language;
MT; ATN programming environment; CRITAC; Japanese text proofreading
system; integer codes; text storage; BetaText; event driven text
processing; text analyzing system; integrated dictionaries; M a T;
word database; national language processing; automatic thesaurus
construction; functional structures; dependency constraints
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An information model for a machine translation system
LEONT'EVA N. N.
Journal paper
Practical; Theoretical mathematical
ENG
ZZ
Nauchno-Tekh.Inf.Ser.2 (USSR)
Autom.Doc.& Math.Linguist. (USA)
VOL. 19; NO. 10; PP. 22-9; S Ref.; DP. 1985
VOL. 19; NO. 5; PP. 92-105; DP. 1985
NIPSBP
ADMLAE
0548-0027
0005-1055
0005-1055/85/$20.00

The French-Russian machine translation system developed at the
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All-Union Translation Center is based on an information model for
understanding any natural-language text. The paper deals with the
external and internal reasons for choosing this model, together
with the composition of it, which involves a discussion of certain
key problems in the linguistic support. It is shown that the
strategy of combining information and translation functions at
first sight complicates the task, whereas in fact it relieves some
load on the system and makes the task feasible, while flexible
links between the components mean that one can adapt the system to
different topic areas and various information requirements on the
text
C7820; C4290

computational linguistics; language translation; natural languages
natural language text understanding; information model; French
Russian machine translation system; All Union Translation Center;
linguistic support; information requirements
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Semantic modules in a machine translation system; complex term
analysis
BELYAEVA L. N.; MATCR1NA L. V.; PIOTROVSKII R. G.; YASHCHEW10 T. V.
Journal paper
Practical; Theoretical mathematical
ENG
ZZ
Nauchno-Tekh.Inf.Ser.2 (USSR)
Autom.Doc.& Math.Linguist. (USA)
VOL. 19; NO. 4; PP. 29-34; 8 Ref.; DP. 1985
VOL. 19; NO. 4; PP. 52-61; DP. 1985
NIPSBP
ADMLAE
0548-0027
0005-1055
0005-1055/85/$20.00

A basic task handled in the machine translation (MT) of scientific
texts is to extract the basic meaning from the input text with
minimal loss and transmit it correctly by means of output-language
facilities. The solution is largely determined by the correct
identification of compound terms (noun terminological word
combinations) in the text, followed by analysis and
translation. This is so because these word combinations contain the
main professional information in the text and reflect the basic
scientific concepts in the area of knowledge represented by it. The
speech-statistics group has designed and implemented four
approaches to the machine translation of noun word combinations,
which have been operated routinely and on an experimental basis:
translating noun terminological combinations as a whole without
analysis by reference to a phrasal dictionary; translation on the
basis of a component analysis for each wordform in the noun word
combination; matrix-free translation; thesaurus-network translation
C4290; C7820

computational linguistics; language translation
semantic modules; semiotics; engineering linguistics; computational
linguistics; machine translation; complex term analysis; scientific
texts; compound terms; noun terminological word combinations; speech
statistics; noun word combinations; phrasal dictionary; component
analysis; matrix free translation; thesaurus network translation
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Discussion session on machine translation
Alvey/ICL Workshop on Linguistic Theory and Computer
Applications. Transcripts of Presentation and Discussions
(CCL/UMIST-86/2)
Manchester, England
Sept. 1985
JOHNSON R.; WHITELOCK P.(Ed.); SOMERS H.(Ed.); BENNETT P.(Ed.);
JOHNSON R.(Ed.); WOOD M. M.(Ed.)
Centre for Computational Linguistics, Univ. of Manchester Inst. of
Sci. & Technol., England
Conference paper
Practical
ENG
GB
Univ. Manchester Inst. Sci. & Technol.; Manchester, England
NP. 215; PP. 169-89; 8 Ref.; DP. March 1986
MT raises some quite interesting theoretical methodological
questions which haven't really been raised up to now. The author
concentrates on that particular collection of issues. The author
assumes that MT systems are extensive, that translation is from
natural language, they apply to more than two languages, they
don't require human intervention and they are linguistic based
C7820

computational linguistics; language translation; natural languages
machine translation; MT systems; natural language
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Alvey/ICL Workshop on Linguistic Theory and Computer
Applications. Transcripts of Presentation and Discussions
(CCL/UMIST-86/2)
Alvey/ICL Workshop on Linguistic Theory and Computer
Applications. Transcripts of Presentation and Discussions
(CCL/UMIST-86/2)
Manchester, England
Sept. 1985
WHITELOCK P.(Ed.); SOMERS H.(Ed.); BENNETT P.(Ed.); JOHNSON
R.(Ed.); WOOD M. M.(Ed.)
Conference proceedings
Practical
ENG
ZZ
Univ. Manchester Inst. Sci. & Technol.; Manchester, England
NP. 215; DP. March 1986
The following topics were dealt with: linguistic analysis and
linguistic theory; default inheritance; deterministic
parsing; grammars; machine translation; lexicons; and syntax and
semantics. Abstracts of individual papers can be found under the
relevant classification codes in this or other issues
C7820; C4210

computational linguistics; grammars; language translation; natural
languages
natural languages; Alvey; linguistic analysis; linguistic theory;
default inheritance; deterministic parsing; grammars; machine
translation; lexicons; syntax; semantics
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Computer aided translation project, University Sains
Malaysia, Penang, Malaysia
WAROTAMASIKKHADIT U.
Journal paper
General
ENG
ZZ
Comput.& Transl. (USA)
VOL. 1; NO. 2; PP. 113; 0 Ref.; DP. April-June 1986
COMTES
0884-0709
Research in CAT started in 1978 with development of grammar models
for English-Malay translation using the software tool ARIANE. A
basic translation system with a vocabulary of 1000 lexical units
was completed in 1982. In 1984, a permanent CAT project unit was
established, and a laboratory prototype for English-Malay
translation was successfully tested in 1985. The English-Thai
machine translation project in Thailand was established in June
1981. Two committees have been appointed to undertake this task,
the English-Thai translation research project, and the Thai
structures research project for English-Thai machine translation
using the ARIANE system
C7820

language translation
English Malay translation; software tool ARIANE; English Thai
machine translation project
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Linguistic research in the Greek group (for Eurotra project)
TSITSOPOULOS S.
Journal paper
General; Practical
ENG
ZZ
Multilingua (Netherlands)
VOL. 5; NO. 3; PP. 149-51; 8 Ref.; DP. 1986

MULTDF 

0167-8507 

0167-8507/86/$2.00

Work for Eurotra in Greece began in a double vacuum, the lack of a
substantial body of theoretical work on the Greek language
inspired by contemporary linguistic paradigms, and the total
absence of ongoing programmes, academic or otherwise, in any
branch of computational linguistics. These infrastructural
deficiencies, normally distinct and with independent histories,
converge disconcertingly in an MT project
C7820

language translation; linguistics; natural languages
linguistic research; Greek group; Eurotra project; Greek language;
linguistic paradigms; computational linguistics; MT project
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Eurotra: general overview
PERSCHKE S.
CEC, Luxembourg
Journal paper
General
ENG
LU
Multilingua (Netherlands)
VOL. 5; NO. 3; PP. 134-5; 0 Ref.; DP. 1986

MULT OP 

0167-8507

0167-8507/86/$2.00

Eurotra is a multilingual machine translation project carried out
by the Commission of the European Communities. The article shows
not only the intrinsic scientific interest and ambition of the
Eurotra project, but also its impact on the future of
computational linguistics in Europe
C7820

language translation; linguistics
multilingual machine translation project; Commission of
European Communities; Eurotra project; computational linguistics
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Sentence disambiguation by asking
TOMITA M.
Dept. of Comput. Sci., Carnegie-Mellon Univ., PA, USA
Journal paper
Practical; Theoretical mathematical
ENG
US
Comput.& Transl. (USA)
VOL. 1; NO. 1; PP. 39-51; 9 Ref.; DP. Jan.-March 1986
0884-0709

Describes a technique for asking questions to disambiguate a
sentence. Such a disambiguation technique is crucial for
interactive machine translation systems, and helps resolve
structural ambiguities. The shared-packed forest representation and
the forest shaving algorithm, along with the efficient parsing
algorithm, enable us to parse and disambiguate highly ambiguous
sentences with hundreds of parses without dealing with hundreds of
individual parse trees
C7820; C4210; C4290

computational linguistics; grammars; language translation
sentence disambiguation; disambiguation technique; interactive
machine translation systems; structural ambiguities; shared packed
forest representation; forest shaving algorithm; efficient parsing
algorithm; parse trees
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Language, sublanguage, and the promise of machine translation
BARON N. S.
Southwestern Univ., Georgetown, TX, USA
Journal paper
Practical; Theoretical mathematical
ENG
US
Comput.& Transl. (USA)
VOL. 1; NO. 1; PP. 3-19; 9 Ref.; DP. Jan.-March 1986
0884-0709

Looks at machine translation (and at natural language processing
more generally) in context of a model of linguistic
communication. In developing this model, the author discovered
strong parallels between human-human communication on the one hand
and human-machine (or human-human communication mediated by
machines, as in the case of machine translation) on the other
C7820; C4290; C4210

computational linguistics; formal languages; language translation;
natural languages
human machine communication; machine translation; natural language
processing; linguistic communication; human human communication
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Review of text-to-speech conversion technology
SAGISAKA Y.; SATO N.
NTT Res. Labs., Musashino, Japan
Journal paper
Practical
JAP
JP
J.Acoust.Soc.Jpn. (Japan)
VOL. 41; NO. 12; PP. 901-5; 42 Ref.; DP. Dec. 1985
NIOGAH
0369-4232

This paper is concerned with text-to-speech conversion technology,
including text analysis and speech synthesis. For text analysis
necessary for speech synthesis, high-grade semantic analysis is
needed for computer-aided translation. However, it is
necessary to improve the semantic analysis technology to some
extent for efficient application
66130; C5585
speech synthesis

text to speech conversion technology; speech synthesis; computer
aided translation; semantic analysis
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Augmented dependency grammar: a simple interface between the
grammar rule and the knowledge
Second Conference of the European Chapter of the Association for
Computational Linguistics. Proceedings of the Conference
Geneva, Switzerland
27-29 March 1985
MURAKI K.; ICHIYAMA S.; FUKUMOCHI Y.
C&C Systems Res. Lab., NEC Corp., Kawasaki, Japan
Assoc.Comput.Linguistics
Conference paper
Practical; Theoretical mathematical
ENG
JP
Assoc.Comput.Linguistics; Morristown, NJ, USA
NP. vii+270; PP. 198-204; 2 Ref.; DP. 1985
This paper describes some operational aspects of a language
comprehension model which unifies the linguistic theory and the
semantic theory with respect to operations. The computational
model, called augmented dependency grammar (ADG), formulates not
only the linguistic dependency structure of sentences but also the
semantic dependency structure using the extended deep case grammar
and field-oriented fact-knowledge based inferences. Fact
knowledge base and ADG model clarify the qualitative difference
between what we call semantics and logical meaning. From a
practical viewpoint, it provides a clear image of
syntactic/semantic computation for language processing in analysis
and synthesis. It also explains the gap in semantics and logical
meaning, and gives a clear computational image of what is called
conceptual analysis
C4290; C4210; C7820

computational linguistics; grammars; language translation
augmented dependency grammar; interface; grammar rule knowledge;
operational aspects; language comprehension model; linguistic theory;
semantic theory; computational model; ADG; semantic dependency
structure; language processing
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An English generator for a case-labelled dependency representation
Second Conference of the European Chapter of the Association for
Computational Linguistics. Proceedings of the Conference
Geneva, Switzerland
27-29 March 1985
TAIT J. I.
Acorn Computers Ltd., Cambridge, England
Assoc.Comput.Linguistics
Conference paper
Practical
ENG
GB
Assoc.Comput.Linguistics; Morristown, NJ, USA
NP. vii+270; PP. 194-7; 12 Ref.; DP. 1985
The paper describes a program which has been constructed to
produce English strings from a case-labelled dependency
representation. The program uses an especially simple and uniform
control structure with a well defined separation of the different
knowledge sources used during generation. Furthermore, the majority
of the system's knowledge is expressed in a declarative form, so
in principle the generator's knowledge bases could be used for
purposes other than generation. The generator uses a two-pass
control structure, the first translating from the semantically
orientated case labelled dependency structures into surface
syntactic trees and the second translating from these trees into
English strings
C7820; C4290

computational linguistics; language translation
English generator; case labelled dependency representation; program;
control structure; knowledge bases; two pass control structure;
surface syntactic trees
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A probabilistic parser
Second Conference of the European Chapter of the Association for
Computational Linguistics. Proceedings of the Conference
Geneva, Switzerland
27-29 March 1985
GARSIDE R.; LEECH F.
Lancaster Univ., England
Assoc.Comput.Linguistics
Conference paper
Practical; Theoretical mathematical
ENG
GB
Assoc.Comput.Linguistics; Morristown, NJ, USA
NP. vii+270; PP. 166-70; 6 Ref.; DP. 1985
The UCREL team at the University of Lancaster is engaged in the
development of a robust parsing mechanism, which will assign the
appropriate grammatical structure to sentences in unconstrained
English text. The techniques used involve the calculation of
probabilities for competing structures, and are based on the
techniques successfully used in tagging (i.e. assigning grammatical
word classes) to the LOB (Lancaster-Oslo/Bergen) corpus. The first
step in the parsing process involves dictionary lookup of
successive pairs of grammatically tagged words, to give a number
of possible continuations to the current parse. Since this lookup
will often not be able unambiguously to distinguish the point at
which a grammatical constituent should be closed, the second step
of the parsing process will have to insert closures and
distinguish between alternative parses. It will generate trees
representing these possible alternatives, insert closure points
for the constituents, and compute a probability for each parse
tree from the probability of each constituent within the tree
C7820; C4290

computational linguistics; grammars; language translation
LOB corpus; probabilistic parser; UCREL; University of Lancaster;
robust parsing mechanism; grammatical structure; unconstrained
English text; dictionary lookup; trees; closure points
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A  probabilistic  approach  to  grammatical  analysis  of  written 
English  by  computer 

Second  Conference  of  the  European  Chapter  of  the  Association  for 
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Work at the Unit for Computer Research on the English Language at
the University of Lancaster has been directed towards producing a
grammatically annotated version of the Lancaster-Oslo/Bergen (LOB)
Corpus of written British English texts as the preliminary stage
in developing computer programs and data files for providing a
grammatical analysis of unrestricted English text. Work is now in
progress to devise a suite of programs to provide a constituent
analysis of the sentences in the corpus. So far, sample sentences
have been automatically assigned phrase and clause tags using a
probabilistic system similar to word tagging. It is hoped that the
entire corpus will eventually be parsed
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Lexifanis is a software tool designed and implemented by the
authors to analyze modern Greek language. This system assigns
grammatical classes (parts of speech) to 95-98% of the words of a
text which is read and normalized by the computer. By providing the
system with the appropriate grammatical knowledge (i.e.,
dictionaries of non-inflected words, affixation morphology and
limited surface syntax rules) any 'variant' of modern Greek
language (dialect or idiom) can be processed
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This paper is concerned with the specifications and the
implementation of a particular concept of word-based lexicon to be
used for large natural language processing systems such as machine
translation systems, and compares it with the morpheme-based
conception of the lexicon traditionally assumed in computational
linguistics. It is argued that, although less concise, a relational
word-based lexicon is superior to a morpheme-based lexicon from a
theoretical, computational and also practical viewpoint
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The METAL machine translation project incorporates two methods of
structural transfer: direct transfer and transfer by grammar. The
author discusses the strengths and weaknesses of these two
approaches in general and with respect to the METAL project, and
argues that, for many applications, a combination of the two is
preferable to either alone
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The Linguistics Research Center (LRC) at the University of Texas
at Austin is currently developing METAL, a fully automatic high
quality machine translation system. This paper describes the
current status of METAL, emphasizing the results of the most
recent post-editors' evaluation, and briefly indicates some future
directions for the system. A six-page German original text and a
raw (unedited, but automatically reformatted) METAL translation of
that text into English are included as appendices
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The following topics were dealt with: natural
languages; grammars; machine translation; parsing; Boolean
operators; lexical database; probabilistic parser; computational
theory; database queries; automated speech recognition; sentence
production model; user modelling; dialog structure; and dialog
strategy in the man-machine communicative context of dialogue
interaction structure. Abstracts of individual papers can be found
under the relevant classification codes in this or other issues
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The distributed language translation project employs a modified
subset of Esperanto as an intermediate language (IL) for machine
translation of information text between natural languages. Text
entered in the source language (SL) is analysed syntactically by
an SL module and then passed to an IL module for semantic
disambiguation. The task of the semantic module is to identify the
most plausible syntactic parse. Interleaved (online) semantics, in
which the syntactic and semantic modules have a symbiotic
relationship, is employed. Word meanings are represented by
semantic vectors, and plausibility is expressed using Zadeh's
test-score semantics and fuzzy logic techniques. Design principles
are developed on a basis of the literature on psycholinguistics,
semantics and computational linguistics, and an expert system
design, in Prolog, is presented. A further aspect of the work is an
expert system for consistency-control, using vectorial
cross-checking techniques. A simplified subset of the system is
implemented in MicroProlog, and some preliminary results on the
disambiguation of twenty-four meanings of 'Time flies' (without
syntactic pre-processing) provide grounds for encouragement in the
further development of the system
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Elements of the history, state of the art, and probable future of
machine translation (MT) are discussed. The treatment is largely
tutorial. The paper covers some of the major MT R&D groups, the
general techniques they employ, and the roles they play in the
development of the field. The conclusions concern the seeming
permanence of the translation problem, and potential
re-integration of MT with mainstream Computational Linguistics
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The syntax analyser mechanism possesses two distinguishing
characteristic features: the grammars of the two languages and the
programs are separated strictly, and a presage mechanism and an
exceeding search mechanism are used instead of the technology of
'backup' in parsing. Theoretically, the construction process of
these two mechanisms is described, an algorithm for building this
presage mechanism, a presage analysis table, is given, and a series
of techniques to improve the presage ability in the grammar
reformation are introduced. In order to prove the feasibility of
the design, the authors have built a syntax analyzer used in the
English-Chinese translation system, and obtained important
verification data
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Describes the organization of the interaction between syntactic
and semantic structures in a system for automatic translation from
French to Russian (FRAT) developed at the All-Union Translation
Center (AUTS). The FRAT language apparatus is based on two
metalanguages used for linguistic analysis: a syntactic language
describing the form of the utterance, and a semantic language
explicating the content. Information about the sentence is
extracted gradually: first the syntactic analysis system
constructs the primary syntactic representation of the sentence
without using semantic information to do so; then the semantic
analysis system constructs the primary semantic representation,
interpreting the primary syntactic representation of the sentence
in terms of the semantic metalanguage using syntactic-semantic
dictionaries. The primary representations complement each other,
and the syntactic metalanguage is used for exchange of
information. The article describes the initial stage of the
interaction of these two representations of the sentence
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This paper compares two approaches to the modelling of human
discourse and, more particularly, dialogue. Both place themselves
within a general information processing paradigm, and both descend
from the insights of Grice (1975) that understanding is a matter of
inference from what is said and what is assumed. So general is that
assumption now, and so widespread are the disciplines that draw
upon it (philosophy, psychology, linguistics and artificial
intelligence (AI)) that it is hard to capture briefly except in
opposition to the transformational-generative paradigm of
language, with its notions of the primacy and autonomy of syntax,
and the theoretical primacy of explications of competence over
those of performance
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The METAL machine translation system, a joint project of the
Linguistics Research Center and Siemens, has been released for use
as part of marketed translation systems. The system, which
presently translates technical German into English, is an
outgrowth of a traditional, generative approach to automatic
analysis and synthesis of natural language phenomena carried on at
the Linguistics Research Center for many years. In its present
manifestation, it is a modular design consisting of purely
monolingual lexicons, transfer lexicons, and an augmented phrase
structure grammar. The grammar is powerful enough to constrain
application, to build new nodes with essential characteristics of
their sons and new synthetic information as well, and to perform
transformations to re-order, delete, and create constituents. The
parser is enhanced to allow application of rules in levels, and
to eliminate unlikely paths via preferential weightings calculated
from lexical and grammatical data. The METAL system, conceived in
recent years as destined for implementation, has an orientation to
user interface which includes sophisticated text stripping,
unfound word handling and reconstitution, and a convenient means
of working with the lexicons interactively
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This paper reflects on the kinds of morphological, syntactic,
semantic, and pragmatic knowledge needed to process ill-formed
input. The authors conclude that an excellent start on processing
ill-formed input has been exemplified in a number of concrete
implementations, but that a substantial amount of fundamental work
must still be done if systems are to understand language robustly
to the degree that humans do. Furthermore, they conclude that
studying ill-formed language offers important perspectives on the
knowledge and architecture needed to correctly understand natural
languages
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This paper provides an overview of a research program defined at
Bellcore. The objective is to develop facilities for working with
large document collections that provide more refined access to the
information contained in these 'source' materials than is possible
through current information retrieval procedures. The tools being
used for this purpose are machine-readable dictionaries,
encyclopedias, and related 'resources' that provide geographical,
biographical, and other kinds of specialized knowledge. A major
feature of the research program is the exploitation of the
reciprocal relationships between sources and resources. These
interactions between texts and tools are intended to support
experts who organize and use information in a workstation
environment. Two systems under development are described to
illustrate the approach: one providing capabilities for full-text
subject assessment; the other for concept elaboration while reading
text. Progress in the research depends critically on developments
in artificial intelligence, computational linguistics, and
information science to provide a scientific base, and on software
engineering, database management, and distributed systems to
provide the technology
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A proposal of external specification of the user environment for
the EUROTRA project is presented. The needs of the users and the
functions which are necessary for any efficient testing
environment are analyzed
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For a linguistic model it is necessary, first of all, to define
the mapping between the strings of words of a language and their
structural organisation, given that with transducers there are
many ways of obtaining the same result using different
strategies. This mapping, called a static grammar, is independent of
the analysis, generation or whatever strategy adopted. Moreover the
formalism of a static grammar is not affected by the choice or
number of interpretation levels. The authors present a static
grammar formalism. Using this formalism, any given language can be
described as a series of 'charts'. Each 'chart' describes how a
certain group of strings corresponds to the structure associated
with this group of strings (this structure is a valid and complete
substructure of the linguistic model). The structures of all the
sentences of a language for a given linguistic model can be
described by means of a series of chart inter-references. The
static grammar is used as a base for writing dynamic analysis and
generation modules; however, the static grammar does not concern
itself with strategic, combinatorial, ambiguity problems or the
choice of structures related to dynamic grammars
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Most existing practical machine translation systems are designed
to translate documentation, such as technical papers and
manuals. However, there is a growing need for translating not only
large texts but also personal short texts such as letters and
informal messages. The conventional machine translation systems,
which are intended to translate large texts, are not very suitable
for these kinds of small jobs. One needs an interactive system
which has a totally different design philosophy. This paper
describes the design philosophy of a personal/interactive machine
translation system, and studies its feasibility
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The work described here was the consequence of the idea that the
authors wanted to make a new, more interesting theoretical start
in EUROTRA. It is preliminary and not fully developed yet: it should
be seen as the reflection of a way of thinking about MT. Currently,
they are making it more precise, and experimenting with it. They
sketch the general outlines of the new EUROTRA framework
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The paper addresses the issue of cooperation between linguistics
and natural language processing (NLP), in general, and between
linguistics and machine translation (MT), in particular. It focuses
on just one direction of such cooperation, namely applications of
linguistics to NLP, virtually ignoring for now any possible
applications of NLP to linguistics, which can range from providing
computer-based research tools and aids to linguistics to
implementing formal linguistic theories and verifying linguistic
models. The author deals with the question why linguistics must be
applied to NLP and what the consequences of ignoring it are. He
provides a counterpoint by discussing how linguistics should not
be applied to NLP and, by contrast and inference, how it should
be. He narrows the discussion down to one promising approach to
NLP, the sublanguage deal, and the interesting ways in which
linguistics can be utilized within a limited sublanguage. He
discusses the things linguistics can contribute to MT
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The author explores some difficult questions related to topics in
discourse analysis (henceforth DA) and offers a partial solution
to some of them. In particular, he addresses the issue of levels in
DA and how the various approaches taken within the field can be
classified according to a leveled model. He then considers an
approach for representing the semantics of discourse, and
considers how it fits in to the proposed model for DA
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The interlingua approach to machine translation (MT) is
characterized by the following two stages: (1) translation of the
source text into an intermediate representation, an artificial
language (interlingua) which is designed to capture the various
types of meaning of the source text and (2) translation from the
interlingua into the target text. Over the years a number of MT
projects tried to develop interlingua-based systems. In these
projects the amount of linguistic and encyclopaedic knowledge used
to produce intermediate representations was quite limited. However,
even at that level difficulties connected with encoding knowledge
seemed overwhelming. The TRANSLATOR project at Colgate University
benefits from recent developments in knowledge representation
techniques. Its interlingua text reflects syntactic,
lexical, contextual, discourse (including speech situation) and
pragmatic meaning of the input. This paper discusses the lexicon
and grammar of the interlingua used in TRANSLATOR, and touches
upon the structure of the bilingual (source language to
interlingua) dictionaries
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The author outlines the Mu MT project, looking at the general
principles and the linguistic framework. He then discusses the
transfer from Japanese to English, looking at dependency
structure, target language word selection, global sentential
structures and inference and context
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When studied as a source of insight into the human language
faculty, rather than to construct a commercially useful service,
mechanical translation (MT) is carried out by coupling an
otherwise normal natural language parsing system to a normal
natural language generation system. The author proposes that a
crucial capability has been omitted from the design of the parsers
that have been used to date, namely a facility for recognizing the
information that is implicit in the form of any well written
text: matters of emphasis, whether a fact is new or old, whether a
relationship is given explicitly or left as an obvious inference,
signals of intended moves in the discourse, and other things of
this sort. He claims that mechanical translations are 'mechanical'
principally because they pay no attention to information of this
sort, and proposes that this can be dealt with by incorporating
into the parser knowledge of the relationship between usage and
form of the sort that is commonplace in any modern language
generation system
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The author describes a machine translation system, LMT, based in
PROLOG, translating from English to German. The effort on LMT per
se has just begun this year, although the logic programming
methodology for the analysis of the source (English) goes back
several years
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Well-known examples such as Bar-Hillel's (1960) 'The box is in the
pen' illustrate that extensive semantic analysis is necessary to
resolve ambiguities that must be resolved in machine
translation. If one accepts the premise that semantics should be
added to the analysis techniques used in machine translation, what
is the way in which it should be added? This paper argues for an
integrated approach to semantic processing. That is, syntactic and
semantic processing should take place at the same time, rather
than in separate stages. However, although the author argues for
the integration of syntactic and semantic analysis processes, he
also argues for the use of a separate body of syntactic knowledge,
and for building a separate syntactic representation during the
parsing process. This is in contrast to previous integrated
parsers, which have relied almost exclusively on semantic
representations to guide the parsing process, and which have not
used a separate body of syntactic rules
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This paper addresses three questions: what is sublanguage?; why is
sublanguage analysis important for automatic translation?; and how
can a translation system take advantage of sublanguage
properties? The first of these questions appears to have a simple
answer. Natural languages clearly have specialized varieties which
are used in reference to restricted subject matter. One speaks, for
example, of the 'language of chemistry' to mean a loosely defined
set of sentences or texts dealing with a particular part of
reality. But when one considers the automatic translation of
specialized language, one is forced to be more precise. One must
describe sublanguages as coherent, rule-based systems. The attempt
to write grammars for special-purpose sublanguages raises a number
of theoretical and practical problems, which are only now being
discussed. But since the only path to high-quality automatic
translation seems to lie through sublanguage (at least during the
next decade or two), one has no choice but to solve these problems
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The case against fully automatic high quality machine translation
(FAHQMT) has been well-canvassed in the literature ever since
ALPAC. Although considerable progress in computational linguistics
has been made since then, many of the major arguments against
FAHQMT still hold. Accepting that FAHQMT is not possible in the
current state of the art, it is both feasible and desirable to set
up R&D programmes in MT which can both produce results which will
satisfy sponsors and provide an environment to support research
directed towards bringing MT closer to the ultimate goal of FAHQMT
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The paper describes the SDCG (semantic definite clause grammars),
a formalism for natural language processing (NLP), and the XTRA
(English Chinese sentence translator) machine translation (MT)
system based on it. The system translates general domain English
sentences into grammatical Chinese sentences in a fully automatic
manner. It is written in Prolog and implemented on the DEC-10, the
GEC, and the SUN workstation. SDCG is an augmentation of DCG
(definite clause grammars) which in turn is based on CFG (context
free grammars). Implemented in Prolog, the SDCG is highly suitable
for NLP in general, and MT in particular. A wide range of
linguistic phenomena is covered by the XTRA system, including
multiple word senses, coordinate constructions, and prepositional
phrase attachment, among others
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MT systems integrate many advanced concepts from the fields of
computer science, linguistics, and AI: specialized languages for
linguistic programming based on production systems, complete
linguistic programming environment, multilevel representations,
organization of the lexicons around 'lexical units', units of
translation of the size of several paragraphs, possibility to use
text-driven heuristic strategies. The authors are now beginning to
integrate new techniques: unified design of an 'integrated'
lexical data-base containing the lexicon in 'natural' and 'coded'
form, use of the 'static grammars' formalism as a specification
language, and design of a kind of structural metaeditor (driven by
some static grammar) allowing the interactive construction of a
document. This paper centers on the study of possible additions of
expert systems equipped with metalinguistic and extralinguistic
knowledge, in order to solve some problems encountered in
second-generation MT systems. Several examples of the possible use
of expert-corrector systems in M(a)T (machine (aided) translation)
systems are given
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Machine translation (MT) systems have historically relied upon
explicit grammars in order to analyze the source text and
reproduce it in the target language. The authors argue for a style
of MT in which the focus of processing is at the level of the
lexicon, rather than the grammar. This approach to translation
allows an analyzer to map source sentences into an interlingual
form, which then can be mapped (perhaps after intermediate
inferencing steps) back into target sentence(s) which are
paraphrase-equivalent to the original. Advantages of the approach
include: (1) the possibility for different paraphrases of the
original; (2) the capability for multi-sentence expression of the
original when no single word (e.g. a verb) exists in the target
language which spans the same meaning complex as a word in the
source; (3) a uniform approach to word sense disambiguation and
anaphoric reference resolution; and, most importantly, (4) the
possibility for robust handling of ungrammatical and elliptical
source text
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The current resurgence of interest in machine translation is
partially attributable to the emergence of a variety of new
paradigms, ranging from better translation aids and improved pre-
and post-editing methods, to highly interactive approaches and
fully automated knowledge-based systems. This paper discusses each
basic approach and provides some comparative analysis. It is argued
that both interactive and knowledge based systems offer
considerable promise to remedy the deficiencies of the earlier,
more ad-hoc post-editing approaches
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ARIANE-78 has been used for years at GETA as the underlying
programming environment for writing many MT systems or
subsystems, in a set of specialized (rule based) languages for
linguistic programming (SLLP). The authors present its recent
evolution, which has been prompted by the feedback from the users,
and has led the implementors to a deep reshaping. In particular,
the control structure of the entire environment has been
parametrized to a large extent, due to the introduction of a
specialized (finite state based) language used for describing sets
of possible sequences of linguistic processes (phases), such as
structural analysis or lexical expansion
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The author presents the general architecture of a production
environment which is specific for a M(A)T system, and gives some
proposals to integrate new functionalities in this system. A good
management of the results of the translation process may lead to
an easier improvement of the linguistic data. He describes a
possible organisation for the machine environment of such a system
for the management of the data base of texts. Finally, he gives
some central rules for the implementation of a monitor
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The background to this paper is the attempt within EUROTRA to
develop a general framework for research and development work in
MT, providing in particular an environment which facilitates
reasoning about the relationships between the representations that
are necessary for automatic translation between natural
languages. The more immediate background is the attempt to apply
this framework experimentally on a small scale in developing a
'proto-EUROTRA'. This paper gives a reasonably clear idea about the
user language and theories of representation for this experiment,
and indicates en route some of the directions for further
work. It reports work in progress, and is thus deliberately
speculative, programmatic, and rather informal
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The following topics were dealt with: EUROTRA; ARIANE; expert
systems; lexicon driven MT; grammars; sublanguage; inference and
context; TRANSLATOR; natural language processing; linguistics; and
METAL. Abstracts of individual papers can be found under the
relevant classification codes in this or other issues
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The automatic generation and analysis of Chinese is the central
topic of machine translation in China. This paper describes the
intermediate constituent method and the logico-semantic method for
the generation of Chinese. It describes also the multi-label and
multi-branch tree method for the analysis of Chinese. At the same
time, it deals with some special problems in Chinese such as the
generation of measure words and the reorganization of the word
order of eulti-sociYltrs
C7820; C1250

computational linguistics; computerised pattern recognition;
language translation

Chinese; machine translation; China; intermediate constituent method;
logico semantic method; multi branch tree method; measure words; word
order


C8WH049 

**Computer**-**aided** **translation** at WCC
PERSCHEID  M.  M. 

Journal  paper 
Practical 
ENG 
ZZ 

CALICO  J. (USA) 

VOL. 3; no. 1; PP. 22-4; 0 Ref.; DP. Sept. 1985

CALJE8 

0742-7778 

Many individuals, companies, and government agencies need to have
a large volume of foreign language printed matter translated for
their use. **Computer**-**aided** **translation** offers the
advantages of speed and volume over the normal non-assisted human
translation process. Weidner is one company which recognized the
needs in this particular area and has developed both hardware and
software to fill this need. Constant improvement and attention to
detail is needed to keep such a system operating at top
accuracy. It is Weidner's goal to stay on top of all advances in
the field as well as to offer a complete line of language and
translation services to the community
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Some computational linguistics projects at CCL are outlined. They
include English-Japanese machine translation, MT systems, and work
on EUROTRA
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Computational linguistics projects at Saarbrücken University are
outlined. These include the SUSY translation system, the ASCOR
system, and the TEXAN text analysis system
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Principal movements for the research and development on machine
**translation** in Europe and America are described based on
reports from 1982-3. The terminology data bank of the Commission of
the European Communities (CEC), machine **translation**, the
EUROTRA project, machine **translation** project of the French
Government (ADI TAO-ESOPE), a machine **translation** experimental
system ARIANE-78, a machine **translation** system called
**TITUS**, a terminology data bank called LEXIS and a terminology
data bank TERMIUM are outlined
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**TITUS** IV is a machine-**translation** system which has evolved
from the needs of multilanguage documentation. The texts are
formulated as abstracts from documents and can be entered in
German, English, French or Spanish. They are simultaneously
translated into the other languages. In order to minimise the
difficulties which result from the complexity of natural
languages, the system is based upon a controlled syntax. The
**translation** ensues exclusively in dialogue. Any uncorrected
clauses require interaction with the editor. The elements of the
dictionaries and related syntactical structures are described in
detail
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A method aimed at eventual full automation of machine
**translation** of technical documents is discussed, wherein the
editor uses a computer to sort out and correct textual
ambiguities. In particular three approaches (ITS, **TITUS**, and
EPISTLE) are considered in some detail

The TITUS 4 system was originally designed to produce
abstracts in the form of sentences or phrases written in
controlled syntax. It is now being improved, partly to give the
user more flexibility in writing sentences, and partly so that the
system can be implemented in other fields than abstracting
services. Improvements being introduced to enhance TITUS 4's
versatility include multiple-clause sentences. Certain
restrictions, however, remain owing to linguistic problems
associated with translation from one language to another

attitudes and training
LOFFLER-LAURIAN A. M.

CNRS,  Paris,  France 
Journal  paper 
General;  Practical 
FRE
FR 

Contrastes (France)

Spec. ser. A4; PP. 43-67; 16 Ref.; OP. Jan. 1984

CNTROO 

0247-016X

The author explains that machine translation (MT) is not yet
acceptable without post-editing, which is examined in some
depth. How much should the post-editor intervene, who is he and
what training has he had? In a brief look at Systran and Titus
she mentions evaluation and quality assessment of translations,
the EURODICAUTOM data base and the pros and cons of language pair
systems as opposed to multilingual
C7820;  C7130. 

government data processing; language translation


language pair systems; multilingual systems; peripheral evaluation;
post editing; attitudes; training; machine translation; MT; Systran;
Titus; quality assessment; translations; EURODICAUTOM data base


C84041252 

Computer assisted translation (TAO) at the Centre de
Documentation Scientifique et Technique (CDST) of the Centre
National de la Recherche Scientifique (CNRS)

DETEMPLE A.

Journal  paper 
Practical 
ENG 
NL

Multilingua (Netherlands)

VOL. 2; NO. 4; PP. 189-94; 3 Ref.; OP. 1983

MULTOF

0167-8507

0167-8507/83/0002-0189$2.00

Putting the PASCAL documentary base onto a multilingual footing
poses a difficult problem. Several computer translation systems
likely to provide suitable solutions have been tested in recent
years at the Centre de Documentation Scientifique et Technique
(CDST) of the French Centre National de la Recherche
Scientifique. There are two systems already operational that might
be able to produce translations suitable for CDST
purposes. TITUS IV, devised by the Institut Textile de France,
is semiconversational and has certain input restrictions, so that
it takes about 30 minutes to input a 50 to 60-word summary. The
American SYSTRAN system, of which the EEC has acquired the rights
for certain European language pairs, is fully
computerized. However, it requires a certain amount of postediting
and the staff of the CDST is currently attempting to establish
exactly how much. Among the other systems studied, ALPS, marketed
in France by Control Data under the name of TRANSMATIC, involves
conversational processes designed to deal with all types of
language processing. The second generation tools, GETA and Sygmart,
offer the most possibilities
C7820; C7430; C6110

language translation; program and system documentation
CDST; CNRS; PASCAL documentary base; computer translation systems;
TITUS IV; American SYSTRAN system; postediting; ALPS; Control Data;
TRANSMATIC; GETA; SYGMART

TITUS IV: a system for the automatic and simultaneous
translation of four languages

Information Management Research in Europe. Proceedings of the
EURIM 5 Conference
Versailles, France
May 1982

DUCROT J. M.; TAYLOR P. J. (Ed.); CRONIN B. (Ed.)

Inst. Textile de France, Boulogne-Billancourt, France

Conference  paper 
Application;  practical 
FRE 
FR 

Aslib: London, England

NR. 212; PP. 177-95; 0 Ref.; OP. 1983

0-85142-171-7 

TITUS IV is an automatic translation technique intended to
manipulate scientific and technical articles with terms in German,
English, French and Spanish. Basic concepts of controlled areas
representing the vocabularies and the syntactic rules are
described. The procedure consists of feeding each input sentence
with a language code to a multilingual lexicon, forming a pivot
language, indexing, transforming the grammar and producing 2, 3 or
4 translated texts. The vocabulary includes subgroups of specialist
terms. The pivot language is in binary form. The lexicon caters for
4 main grammatical forms, namely substantives, adjectives, verbs
and adverbs, although there are several other groups. Error
messages are printed in all four languages, enabling the operator
to redraft the input as required. Applications of the technique
have been made using an IBM 4311-12, the major processing of each
document occupying only 2.5 secs. of CPU time. At the terminal,
however, a typical input and output schedule would be 25
documents, including bibliographic references, during 6 hours work
C7820 

language  translation 

simultaneous translation; languages; automatic translation; German;
English; French; Spanish; vocabularies; syntactic rules; input sentence;
multilingual lexicon; pivot language; indexing; grammar; substantives;
adjectives; verbs; adverbs

Information system TITUS

GURTLER Z.

Suoro, Praha, Czechoslovakia

Journal  paper 

General 

CZE

CS

Cesk. Inf. Teor. & Praxe (Czechoslovakia)

VOL. 21; NO. 6; PP. 175-81; 4 Ref.; OP. 1979
CITPBH  4 

General characteristics of the system are given together with the
description of the variant TITUS II which is presently
used. The variant based on a formalized documentary language can be
processed by the computer and makes possible the automated translation
of processed document records stored in the computer memory into
German, English, Spanish and French
C7210; C7820

information services; language translation

TITUS; documentary language; automated translation; German; English;
Spanish; French

0790312*2

The development of the 'TITUS' four-language automatic
translation method

SUlfF  S. 

Journal paper

Application; Practical

FRE

FR

Inf. & Doc. (France)

NO.; PP. 20-4; 0 Ref.; OP. May 1979
COIOAO 

'TITUS II' is essentially based on a documentary language
which is a simplified and formalised form of natural language. The
author describes the vocabulary principles used and the standard
structure of phrases. The mode of operation is schematically
illustrated and the role and the output of translators is described
C7820 

language  translation 

TITUS; automatic translation method; documentary language; natural
language; four language translation

Experiences with TITUS II
ZINGEL H. J.

ZTDI, Dusseldorf, Germany
Journal  paper 
General;  Practical 
ENG 
DE

Int. Classif. (Germany)

VOL. 5; NO. 1; PP. 33-7; 0 Ref.; OP. March 1978
Description of the international cooperative documentation system
called TITUS (Textile Information Treatment Users' Service) in
its previous and present form (TITUS II). It uses a special
linguistic way of automatic translation of abstracts and index
terms (with a controlled vocabulary and a controlled syntax) in
order to supply users of the English, French, German or Spanish
language with abstracts in their native language from inputs in
one of the other languages
C7820; C7240

information analysis; language translation
TITUS II; international cooperative documentation system; Textile
Information Treatment Users Service; automatic translation;
abstracts; index terms; English; French; German; Spanish

14. Abstract

The aim of this Lecture Series is to show how computer assisted translation (CAT) can be of
benefit not only to information managers but also to end-users. Existing systems will be described
as well as the nature of the texts to be processed, the technical and human problems related to the
use of such systems and the needs of end-users (quality level of translations, information
acquisition in the mother tongue...). Examples of on-going applications and systems under
development will also be presented. These examples will highlight the benefits documentation
centres will derive from CAT and suggest solutions of interest to the end-user.